TELECOM ACCESS STANDARDS NEWSLETTER NO. 135

OCTOBER 2002

CONTENTS
VOICE OVER IP TELEPERMIT ISSUES - SPECIAL EDITION

1. INTRODUCTION
2. ASSESSING VOICE QUALITY
3. SHARING THE IMPAIRMENTS
4. LOUDNESS
5. PUTTING IT ALL TOGETHER
6. DELAY AND PACKET LOSS
7. SUMMARY OF RECOMMENDATIONS
Appendix. LOUDNESS RATING IN AN "ALL-IP" OR "ALL DIGITAL" CALL




1. INTRODUCTION

Quality of Service
As explained in Newsletter No. 134, which provided a relatively brief overview of the issues, the Quality of Service offered by Voice over IP private networks has been a major project for Access Standards in the past few months. This trend to VoIP is worldwide and it will result in calls being carried over mixed circuit-switched/VoIP paths for many years. This "mixed environment" is expected to be the most likely to incur degraded voice quality, especially while there is so much emphasis on cost-cutting in the present economic climate. Once "everything is IP", some of the impairments caused by mixed operation will no longer apply.

The VoIP mode of operation is being widely promoted, especially for larger corporate networks. This trend is based on claimed cost savings from "integrating" voice and data operations into a common network. Unfortunately, packet switching and transmission of voice calls can incur significant additional impairments when compared with current circuit-switching used in the PSTN and most private networks. Low bit rate encoding and packet handling can bring substantially increased delay and distortion. This has made compliance with Loudness Ratings a more significant issue, as covered in previous Newsletters.

Newsletter No. 134 explained that maintaining a good Quality of Service (QoS) with VoIP is going to assume a lot more importance for a network operator than in the past. Achieving acceptable quality for all calls becomes far more complex when two or more different networks are involved, as no single operator has full control. The overall QoS is determined by the sum of the impairments, so any network introducing impairments has an impact on the call as a whole.

Private networks are expected to be the "unknown factor". As long as most calls are still to or from the PSTN, with its traditional 64 kbit/s circuit-switching and low delays, any private network "stretching the limits" will usually achieve adequate QoS and keep its users happy.

In the longer term, unless all parties take care, problems will increase. If too many private networks "stretching the limits" are set up, there will be increasing probability of their experiencing poor quality end-to-end calls with one another. Initially, with few such networks in service, such poor calls will be infrequent. As the number of such networks grows, poor calls will become much more noticeable and complaints will be made. This was illustrated in the Figure published in Newsletter No. 134.

Like the old proverb, "it is the last straw that breaks the camel's back". Assuming each party claims that their network is not responsible, then which "straw" needs to be "lifted" to fix these problems?

Because QoS is such an important issue, this Newsletter aims to complement previous Newsletters in an effort to make sure that suppliers are aware of Quality of Service issues.

It is most important that excessive emphasis is not placed on cost reduction at the expense of call quality for other PSTN users, here and overseas.

Why so much emphasis on voice quality?
All this emphasis on voice quality has resulted from the continuing development and increasing introduction of Voice over IP networks and "IP add-ons" to conventional circuit-switched digital PABX systems. Telecom's PSTN will also, in time, convert to VoIP operation. However, this will employ very high speed "carrier grade, high performance routers" which incur relatively low delays.

IP operation is expected to bring extensive new features and services, along with economies of operation in the future. This will be especially so when we have "an all-IP" world.

However, while we have a "mix" of technologies, there is serious potential for degraded voice quality unless all concerned take reasonable steps to limit the potential impairments.

Setting up a circuit-switched call may be delayed due to congestion under heavy traffic conditions but, once connected, it remains connected for the duration of the call, and is subjected to little more than geographic delays due to the distance between the two ends of that call. VoIP calls tend to suffer increasing delay and/or packet loss if the network becomes congested and no form of voice prioritising has been implemented. Traffic congestion in circuit switched networks restricts new calls being established but does not affect voice quality. Traffic congestion in packet networks degrades the voice quality of all calls in progress at that time. Ongoing traffic management and timely provision of transmission capacity will be essential to maintaining VoIP voice quality levels.

The 64 kbit/s ITU Rec. G.711 encoding (analogue to digital and digital to analogue conversions) used in circuit-switching incurs well under a millisecond of processing delay, with no equipment impairment (see below). In comparison, an IP call encoded to ITU Rec. G.729A - the currently preferred low bit rate encoding process - involves 15 milliseconds encoding delay, plus 10-11 R-units of equipment impairment (distortion). Packetisation delays, packet handling delays over the IP network, and variations in those delays (jitter), add to the propagation delays due to the length of the call path. To make matters worse, a non-preferred encoding scheme and/or transcoding brings increased distortion, packet loss, and other impairments, all of which affect the overall conversation quality.

These are distinct possibilities if cost alone is the deciding factor. Telecom stresses that network designers must consider the overall quality of a voice call, not just the cost.



2. ASSESSING VOICE QUALITY

The E-model
The ITU E-model, as described in Recommendation G.107, can be used to predict the subjective quality of a telephone call. The E-model is based on quantifying the various transmission impairments in a manner that permits them to be added together to assess the resultant overall end-to-end call quality. The result is expressed as an "R" value, which reflects the perceived quality of the connection. R values can readily be converted into traditional Mean Opinion Score (MOS) or percentage "Good or Better (GOB) / Poor or Worse (POW)" type measures.

The R value is assessed by the following formula:

R = Ro - Is - Id - Ie,eff + A

Where
Ro represents the basic signal-to-noise ratio (highly dependent on Overall Loudness Rating) and includes the effects of background noise, circuit noise, etc;

Is covers "simultaneous" impairments to the voice signal (including too low values of overall loudness, non-optimum sidetone, and quantisation distortion);

Id covers delay impairments (talker and listener echo, loss of interactivity, etc);

Ie,eff covers special equipment-generated impairments, such as encoding by low bit-rate encoders, packet loss, etc, as mentioned above in relation to encoding schemes.

A is the "Advantage of Access Factor" which allows the planner to take into account the fact that customers may accept a decrease in quality (R value) to obtain an access advantage, e.g. to obtain mobility or to provide connections into very hard-to-reach regions.

High customer expectations will normally preclude the advantage of access factor A from being applied to connections which traverse the NZ public telephone network, including calls to and from mobile and private networks.

It could however be applied to connections which are entirely within a private network.

E-model input parameters include loudness ratings, delay, noise values, and equipment impairment factors. ITU-T provides equipment impairment and delay values for encoding, digital processing and packetisation. Typical values are provided as defaults for other model inputs.

It is clear from the E-model that you cannot have less than optimal loudness ratings, high delay, and low bit rate encoding all at the same time and still expect good voice quality.
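The additive structure of the formula can be sketched in a few lines of Python. This is an illustration only, not a G.107 implementation: the function and argument names are our own, the individual terms must still be taken from the ITU-T tables or from measurement, and the example simply lumps Ro - Is together as the R value of about 94 that this Newsletter quotes for a clean 64 kbit/s call. The R-to-MOS conversion is the standard mapping given in G.107.

```python
def r_value(ro, is_, id_, ie_eff, a=0.0):
    """Combine the E-model terms: R = Ro - Is - Id - Ie,eff + A."""
    return ro - is_ - id_ - ie_eff + a


def r_to_mos(r):
    """Convert an R value to an estimated Mean Opinion Score (G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


# Illustrative inputs only: a clean call is taken as R = 94, then
# 11 R-units of equipment impairment are deducted for G.729A encoding.
clean = r_value(ro=94, is_=0, id_=0, ie_eff=0)
g729a = r_value(ro=94, is_=0, id_=0, ie_eff=11)
print(clean, round(r_to_mos(clean), 2))
print(g729a, round(r_to_mos(g729a), 2))
```

Because every impairment is simply subtracted, adding delay (Id) or a non-optimum loudness (Is) on top of the encoding impairment lowers R further, which is the point made in the paragraph above.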

Quality levels
As explained in Newsletter No. 131, the relationship between "R" value and subjective assessment is illustrated by the following Table:-

R Value    User Satisfaction
90         Very Satisfied
80         Satisfied
70         Some users not satisfied
60         Many users dissatisfied
50         Nearly all users dissatisfied

As an example for comparison purposes, a circuit-switched 64 kbit/s local PSTN call with no other impairments and optimum Overall Loudness Rating has an R value of approximately 94.

Normally acceptable "PSTN Quality" is indicated by an "R" value of 70 and above. Without complicating matters too much, this means that there are only 24 R-units (impairment units) available to cover packet transmission and still maintain "PSTN quality". This margin is shared by all networks and CPE involved in making a connection, but most of it could easily be used up by just one network's characteristics. For example, non-optimal loudness ratings at one end could use up to 16 R-units; G.729A encoding alone uses up 10-11 R-units; and a delay of 400 ms mouth-to-ear would use up the entire 24 R-units.

At the bottom end of the "quality scale", use of connections with R-values below 50 is not recommended by ITU-T, as this is considered to be an unacceptable quality level. This would allow each private network to use up as many as 22 R-units for a national call, excluding any impairments in the public network. In reality, the total impairments will have to be less than this, as designers should set out to achieve "PSTN Quality" as far as possible, not design to make "nearly all users dissatisfied!".
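The budget arithmetic in this section can be checked with a short Python sketch. The constants are the figures quoted above. The function names are our own, and we have assumed that each row of the Table gives the lower bound of its band.

```python
CLEAN_PSTN_R = 94     # clean 64 kbit/s local call, as quoted above
PSTN_QUALITY_R = 70   # lower bound of normally acceptable "PSTN Quality"
ITU_FLOOR_R = 50      # connections below this are not recommended

# (lower bound of R, user satisfaction), highest first.
# Assumption: each Table row is the lower bound of its band.
SATISFACTION = [
    (90, "Very satisfied"),
    (80, "Satisfied"),
    (70, "Some users not satisfied"),
    (60, "Many users dissatisfied"),
    (50, "Nearly all users dissatisfied"),
]


def satisfaction(r):
    """Return the user satisfaction band for a given R value."""
    for lower_bound, label in SATISFACTION:
        if r >= lower_bound:
            return label
    return "Not recommended"


def remaining_margin(impairments, target=PSTN_QUALITY_R):
    """R-units left after deducting impairments; negative means target missed."""
    return CLEAN_PSTN_R - target - sum(impairments)


print(remaining_margin([]))                   # the whole budget
print(remaining_margin([11]))                 # after G.729A encoding alone
print(remaining_margin([11, 16]))             # plus worst-case loudness at one end
print((CLEAN_PSTN_R - ITU_FLOOR_R) / 2)       # share per private network down to R = 50
print(satisfaction(CLEAN_PSTN_R - 11 - 16))   # band for the combined case
```

In the combined case, low bit rate encoding plus a poor loudness rating at one end already overspends the 24 R-unit margin before any delay impairment is counted.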

In all this, it is important to recognise that those most likely to experience poor quality are the customers on the private network concerned, as all of their calls would be subject to the excessive impairments!




3. SHARING THE IMPAIRMENTS

Getting a fair share
Unfortunately, while the ITU sets out relative performance objectives for end-to-end calls, and recommends OLR and optimum Loudness Ratings for CPE, there is nothing agreed "internationally" as to how other impairments (equipment and delay) should be distributed between the various networks and CPE items involved in an end-to-end call. This is because the various impairments are all inter-active and some parameters impact on more than one impairment.

As a result, we are taking things carefully as far as our PTC requirements are concerned. We are trying to ensure that there is a balance between reasonable requirements for operating these systems and the need to avoid degrading quality of voice transmission for other customers calling into or being called from private networks.

There are many factors involved and these Newsletters will attempt to provide a simple explanation of the main ones in the context of private network design. This particular issue deals mainly with Loudness Rating, level and loss issues, delay and encoding aspects, but little on the subject of voice prioritisation. This will be dealt with in a future issue.




4. LOUDNESS

Loudness (Ro and Is) is primarily defined by the CPE at each end of the call in terms of the Loudness Ratings, along with the losses or gains in level over the call path. These are the factors which contribute to the Overall (end-to-end) Loudness Rating (OLR).

CPE Loudness Ratings
Our Newsletters and specifications have placed a lot of emphasis on achieving the optimum Loudness Rating of the telephones concerned. This seems the best starting point in any quest to achieve good overall transmission quality, as loudness has a major effect on perceived voice quality and optimum settings can usually (although not always) be achieved at no additional cost.

This is especially the case for digital and IP telephones, for which Telecom has adopted the now internationally agreed optimum Loudness Ratings (North American ANSI/EIA/TIA wireline loudness standards have now been harmonised with ITU-T loudness ratings). CPE suppliers are moving to the agreed standards and it is expected that more "international digital voice products" will soon be widely available here. This should help reduce costs and ensure a wider range of choice for our customers.

Overall (End-to-End) Loudness Ratings
The ITU-T has recommended that a nominal Overall Loudness Rating (OLR) of 10 dB be adopted for voice calls. OLR is given by the sum of the Send Loudness Rating (8 dB), the Circuit Loudness Rating (the sum of the losses and gains within the network) and the Receive Loudness Rating (2 dB).

Telecom's transmission planning is centred around achieving 10 ± 3 dB OLR for the majority of calls. This currently takes into account the traffic-weighted mean analogue access line loss of 2.5 dB such that, while shorter lines may sound a little louder and longer lines may sound a little quieter, the loudness of most calls is generally around the most "comfortable level".

On an "all IP" or "all Digital" call (no analogue line or trunk losses to take into account), the network introduces neither gain nor loss, so the OLR is wholly determined by the telephones at each end of the call. This is illustrated in more detail in Appendix 1.
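As a minimal sketch of the OLR sum described above, the following Python fragment adds the three components and checks the result against the planning target, read here as 10 dB plus or minus 3 dB. The function names and the "quiet sender" figures are hypothetical and are there only to show the effect of a non-optimum telephone.

```python
NOMINAL_OLR_DB = 10.0   # ITU-T recommended nominal OLR
TOLERANCE_DB = 3.0      # planning tolerance, read from the text above


def overall_loudness_rating(slr_db, clr_db, rlr_db):
    """OLR = Send LR + Circuit LR (network losses less gains) + Receive LR."""
    return slr_db + clr_db + rlr_db


def within_plan(olr_db):
    """True if the OLR lies within the nominal value plus or minus the tolerance."""
    return abs(olr_db - NOMINAL_OLR_DB) <= TOLERANCE_DB


# All-digital call between two optimum telephones: no network loss or gain.
all_digital = overall_loudness_rating(slr_db=8, clr_db=0, rlr_db=2)

# Hypothetical telephone sending 5 dB too quietly (a higher rating means quieter).
quiet_sender = overall_loudness_rating(slr_db=13, clr_db=0, rlr_db=2)

print(all_digital, within_plan(all_digital))
print(quiet_sender, within_plan(quiet_sender))
```

With no network loss or gain to absorb the error, the non-optimum telephone alone takes the call outside the planning range.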

"Soft" Telephones
A "soft" telephone can be loosely defined as software which allows a headset and a sound card associated with a PC to perform the same functions as a conventional telephone. In such cases, neither the sound card nor the headset are prescribed and anything could be used. There is unlikely to be any formal control of the Send Loudness, although the user will usually be able to alter the receive volume to suit personal preferences. This means the person at the other end of the call is at the mercy of whatever has been set up by the soft telephone user.

In view of the importance of ensuring optimum Loudness Ratings for telephones, it is not surprising that the concept of a "soft" telephone does not generate much enthusiasm at Access Standards.

Level and Loss plans
To maintain or achieve optimum loudness, level adjustments are necessary at any interfaces between an IP or digital network and any analogue telephones or lines. Each national network has its own transmission plan to set out what is required, so products need to be "customised" accordingly. Digital and native IP telephones normally have the send level pre-set at the ITU-recommended optimum and are generally more straightforward, with just the receive level adjustable to the user's preference.




5. PUTTING IT ALL TOGETHER

Draft Specification PTC 220 (Private Voice Networks)
The general practice is for suppliers to provide a range of gains or losses at these interface points so that their products can be set up for the network concerned. This is the principle behind the draft supplement to PTC 107 (PABX External Port Interfaces) and PTC 217 (Bandwidth Management Devices), which is soon to be extended and published as the draft of PTC 220.

This develops the theme of "getting the loudness right" by optimising gain and loss settings in the various network and CPE interfaces. As with CPE Loudness Ratings, our contention is that if there is an adjustment facility, it may as well be correctly adjusted. It costs no more to get it right!

The draft specification sets out Telecom proposals on the basis of optimising OLR's for the long-term situation of an "all IP world".

Go digital!
In focusing on the ideal future situation, our proposals involve some compromises for those calls initially using "mixed 2-wire analogue and 4-wire digital/VoIP" operation. However, these compromises allow the overall loss and gain settings to be considerably simplified in relation to some overseas schemes. Another advantage is that either an analogue trunk or an analogue telephone interface can be converted later to digital (or IP) without impacting on the gain/loss settings of other parts of the private network.

We recommend that digital network interfaces with the PSTN should always be regarded as the first preference for digital PABX systems and private networks, whether these are circuit-switched or IP (see Item 2 of Newsletter No.129).

A draft specification supplement is published free of charge on our website (click on http://www.telepermit.co.nz/vopspec.html or "On-line Specifications" on the left hand side of our home web page). This document includes a series of sketches of the different interfaces involved and illustrates our concept of a "zero reference level" within an IP cloud being held at the same reference level as our PSTN core network.

Half-calls
The sketches in the draft specification supplement illustrate a series of "half calls". These can be viewed in combination to assess the Loudness Ratings for any combination of analogue, digital or IP telephone and trunk being involved in an end-to-end call. Note that the far end may be a similar private network, a conventional phone on an analogue line, a cellphone, or any other combination of line and CPE.

We have shown compensatory gains for the 0.5 dB and 6 dB "T" and "R" pads used within the Telecom network so that numbers add up accurately. However, it is not essential that compensation be provided for the 0.5 dB "T" pad, as compromises have been made in any case.

We appreciate that the proposed plan may not align with many manufacturers' current arrangements, so industry comment is welcomed on these proposals. Our aim is to optimise levels and our proposals are not necessarily the only way this aim might be achieved.




6. DELAY AND PACKET LOSS

Delay budgets
Given that the optimum Overall Loudness Rating, or close to it, can be achieved for all calls, one of the biggest changes with IP operation will be the potential for call delays in addition to those due to the geographic distance that the call traverses.

These added delays occur as the result of a number of factors:-

These factors all tend to be inter-dependent when it comes to assessing the overall quality of a call.

In a typical IP private network, these added delays could easily amount to 200 or more milliseconds unless some sort of control is exercised in the network design and voice packets are given priority. It is important to note that they are "one-way" delays (i.e. mouth-to-ear) and their impact is doubled in relation to normal "face-to-face" conversation.

Mouth-to-ear delays of more than 150 ms impact on inter-active speech (ease of conversation). This has always been evident on overseas calls involving satellite links, and is now likely to be more evident on international cable circuits and even on national and local calls where two or more networks are interconnected and each incurs significant delay. Such networks could include private IP networks, digital mobile networks, in any combination. Any added delay is particularly significant when overseas calls are involved, as these may already be subject to a long geographic delay. For example calls between NZ and Europe will already incur around 180 ms.

Delay limits
The ITU recommends that one-way delays end-to-end be kept below 150 ms for normal telephone conversations. Above this level, there is a tendency for "double talk", whereby one person may interrupt the other after short pauses.

This effect becomes quite unacceptable to most people if one-way delays exceed 400 ms.

Both of these limits are themselves conditional on there being no perceptible echo. Should echo be present, the issue of delay is far more critical. One-way delays exceeding 15 ms in the NZ network are likely to result in objectionable echo unless echo cancellation is applied. Telecom does not deploy echo cancellers on either local or national calls. This is why Telecom requires echo cancellation to be incorporated as a matter of course in any CPE or other equipment that introduces more than about 10 ms of extra delay into a call. This limit is expected to include all IP-based private networks.

Since it is very difficult to ensure that the encoding, packetisation, and variable packet handling delays incurred in VoIP operation will never exceed 10 ms, echo cancellation is always to be deployed.

Interim delay limits within a private network
Because most delays will vary according to the design of the private network, our draft PTC specification requires that testing is carried out on the equipment items with no intervening "network". This avoids the network delay and any variations in it so that the testing relates only to the equipment concerned.

Our proposed interim delay limit under the test conditions specified is 50 ms (excludes distance related propagation delays and bandwidth sharing with other connections). By the time other delays are added in a practical network, the total is then expected to be comparable to the 80-110 ms delay incurred by calls originating or terminating in a digital cellular network. Thus, local connections between two VoIP networks, each with a delay of around 100 ms, will give a total delay similar to that of a national call between two digital cellular customers.

In comparison, a typical circuit-based international call between PSTN customers in NZ and Europe, incurs a delay of around 180 ms over cable. An international call between digital cellular or IP customers in NZ and Europe will just have delays within the maximum ITU recommended limit of 400 ms.
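The delay comparisons above are simple sums of one-way delays, which can be sketched in Python as follows. The 150 ms and 400 ms limits and the 100 ms and 180 ms figures are those quoted in this section. The function names and the wording of the assessments are our own.

```python
COMFORT_LIMIT_MS = 150   # ITU-recommended one-way limit for normal conversation
MAX_LIMIT_MS = 400       # above this, most people find the delay unacceptable


def one_way_delay(*segments_ms):
    """Total mouth-to-ear delay as the sum of the delays of each segment."""
    return sum(segments_ms)


def assess(delay_ms):
    """Classify a one-way delay against the two limits above."""
    if delay_ms <= COMFORT_LIMIT_MS:
        return "within recommended limit"
    if delay_ms <= MAX_LIMIT_MS:
        return "interactivity affected"
    return "unacceptable"


VOIP_NETWORK_MS = 100    # private VoIP or digital cellular network, as above
NZ_EUROPE_CABLE_MS = 180

local_voip = one_way_delay(VOIP_NETWORK_MS, VOIP_NETWORK_MS)
intl_pstn = one_way_delay(NZ_EUROPE_CABLE_MS)
intl_voip = one_way_delay(VOIP_NETWORK_MS, NZ_EUROPE_CABLE_MS, VOIP_NETWORK_MS)

for name, delay in [("local VoIP to VoIP", local_voip),
                    ("NZ to Europe, PSTN", intl_pstn),
                    ("NZ to Europe, VoIP both ends", intl_voip)]:
    print(name, delay, assess(delay))
```

The last case totals 380 ms, which is why such calls are described above as only just within the 400 ms maximum.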

It is recommended that private network designs should aim at a delay of substantially less than 100 ms, which should be seen as the exception rather than the rule. If you have delays of the order of 100 ms, then it is desirable to avoid the use of low bit rate encoding. If you do use low bit rate encoding, then it is desirable to avoid delays approaching 100 ms.

Jitter buffers and packet loss
As mentioned above, where any variation in overall delay can occur, it is usual to install a jitter buffer at the terminating end. These devices may be set for either a fixed delay or, preferably, may be set to vary the delay according to traffic conditions on the network concerned. The latter is preferable, as the buffer's setting adds to the usual delay across the network.

A jitter buffer allows packets to be received at irregular rates or even in the wrong order. Its purpose is to output packets in a smooth stream to the end user, using its inbuilt delay to "wait" for a late packet. Unlike data, which is relatively time-insensitive, and can have a packet re-sent if one is lost, voice requires the packets to arrive regularly, but can tolerate the odd lost packet.

The jitter buffer provides a nominal margin for late packets, but if a packet is too late, it is simply dropped. Packet loss can be concealed to some extent, but can still cause substantial impairment of voice quality. As an example, 1% packet loss with G.729 encoding introduces around 5 R-units of impairment. The longer the jitter buffer delay, the less chance there is of a packet being dropped.
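The trade-off between buffer delay and dropped packets can be illustrated with a deliberately simplified Python model of a fixed jitter buffer. The model and the delay figures are hypothetical: playout is assumed to be scheduled at the smallest observed network delay plus the buffer depth, and any packet whose own delay exceeds that is counted as dropped. Real buffers, particularly adaptive ones, are more sophisticated.

```python
def late_packet_fraction(network_delays_ms, buffer_ms):
    """Fraction of packets dropped for arriving after their playout time.

    Simplified fixed-buffer model: a packet is late when its network delay
    exceeds the smallest observed delay plus the buffer depth.
    """
    base = min(network_delays_ms)
    late = sum(1 for d in network_delays_ms if d > base + buffer_ms)
    return late / len(network_delays_ms)


# Hypothetical one-way network delays (ms) for ten consecutive voice packets.
delays = [40, 42, 41, 55, 43, 90, 44, 41, 62, 40]

for buffer_ms in (10, 20, 60):
    loss = late_packet_fraction(delays, buffer_ms)
    print(f"buffer {buffer_ms} ms: {loss:.0%} dropped, "
          f"playout delay {min(delays) + buffer_ms} ms")
```

In this example a 60 ms buffer drops nothing but adds 60 ms to every packet, while a 10 ms buffer keeps delay low at the cost of dropping the three packets delayed by more than 50 ms.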

The buffer setting is thus a compromise between adding to the overall delay and losing packets. However, it is a network design issue to assess the traffic, the delay across the private network, the amount of jitter, and the impact of these factors on voice performance. These factors depend on call paths, the traffic carrying capacity of the network and the degree of prioritisation awarded to voice packets over data packets.

From the above, it will be seen that buffers need to be appropriately set up and that ample transmission capacity must be provided. Additionally, it is important that traffic is monitored and that additional transmission capacity is provided at appropriate times as packet traffic grows.

User-initiated delays
What is of real concern is that there is scope for private networks to add a lot of extra delay. Even beyond what the network designer may have had in mind, there is the ability for a customer to increase delay even further. A simple example is replacing a wired phone with a DECT phone in order to achieve a "wireless office". This would add a further 15 ms or so. Diverting a call to a cellular number from within the private network would be far worse and such changes may be customer-driven, as distinct from formally designed.

It is important that the CPE industry, as a whole, should try to make sure that customers are aware of these issues. Probably, this can be best done via user manuals and general customer advice when a VoIP network is being commissioned. Referring the more technically-minded clients to our Newsletters might be a suitable way of getting this message across.




7. SUMMARY OF RECOMMENDATIONS

Loudness and level plans
These are considered the first thing to get right, as achieving the optimum Loudness Ratings and compliance with the national transmission plan will leave the most room for other potential impairments, some of which cannot be completely avoided.

Encoding protocols
Encoding to ITU-T Recommendation G.711 is Telecom's preferred approach, as this avoids both delay and additional impairments. However, it is appreciated that this may prove more expensive and use up limited capacity on some links, especially when packet overhead is taken into account.

Of the low bit rate encoding processes, ITU-T Recommendation G.729A is the preference. This introduces a lower delay (35 ms) due to coder-related processing (Ref ITU-T Rec. G.114) and fewer Equipment Impairments than G.723.1, with little difference in efficiency.
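The effect of packet overhead on link capacity can be estimated with a short Python sketch. The assumptions are ours and are not taken from the PTC specifications: each voice packet carries 40 bytes of IPv4, UDP and RTP headers, link-layer framing is ignored, and packets are sent every 20 ms unless stated otherwise.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12   # IPv4 + UDP + RTP; no link-layer framing


def ip_bandwidth_kbps(codec_kbps, packet_ms=20):
    """Approximate one-way IP bandwidth for one call, headers included."""
    payload_bytes = codec_kbps * packet_ms / 8    # kbit/s x ms gives bits
    packets_per_second = 1000 / packet_ms
    total_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES
    return total_bytes * 8 * packets_per_second / 1000


print("G.711, 20 ms packets:", ip_bandwidth_kbps(64))
print("G.729, 20 ms packets:", ip_bandwidth_kbps(8))
print("G.729, 40 ms packets:", ip_bandwidth_kbps(8, packet_ms=40))
```

Under these assumptions the headers add 16 kbit/s to either codec at 20 ms packetisation. Sending larger packets reduces that overhead, but only by adding packetisation delay, which is the same cost-against-quality trade-off discussed throughout this Newsletter.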

Network interfaces
At this stage, an ISDN digital interface is strongly recommended over analogue interfaces with the PSTN. In due course the new VoIP PSTN will offer packet interfaces.

Call paths
As explained above, digital interfaces with the public network are the preferred option. These ensure there are no unnecessary losses in transmission level.

To cover the issue of delay and other impairments, which will vary according to the type of call, the following approaches are recommended:-

  1. For calls wholly within a private network, the customer concerned can choose how to balance costs against quality, as the decision only affects calls between those connected to that private network.

    End-to-end delays under normal traffic conditions should preferably be kept under 150 ms. Delays under worst case conditions should always be kept under 400 ms.

    The type of voice encoding used is also decided by the customer. As an example, G.729 at 8 kbit/s is one of the more common types, as it allows more voice calls to be carried for the same number of bit/s. However, this is done at the expense of more processing time, thus adding to the delay. The encoding scheme and any other impairments need to be considered along with the expected delays in order to ensure that the overall voice service quality is satisfactory (assess using the E-model).

    Use of G.729 introduces 10-11 R-units of equipment impairments. This is approximately equivalent in effect to adding 100 ms of delay if mouth-to-ear delay is already greater than 150 ms.

  2. For national calls into the PSTN, most larger private networks are likely to offer a toll by-pass function. At the gateway with the PSTN, a call will preferably use 64 kbit/s G.711 encoding, rather than analogue. Our recommendation is that such calls should, wherever possible, be G.711 all the way from the source telephone (as for international). For economic reasons, low bit rate voice encoding may normally be used within the private network, but this would result in transcoding at the private network-PSTN interface. This should be avoided as it still incurs the delay and degradation due to the low bit rate encoding and packetisation.

    With the geographic distances in the main islands of New Zealand, even the longest call path is unlikely to cause more than 15-20 ms of path delay, so there is reasonable leeway for delay within the private network to keep within a target of 150 ms total delay and so avoid significantly worsening call quality.

    Another key issue is that the originator of the call is usually paying for that call and should be able to choose a different route if the lowest cost route is considered unsatisfactory.

  3. For international calls from the private network, the preferred approach is to pass the call to the PSTN at the closest possible point to the originating telephone, using G.711 encoding direct from the source if possible. This not only avoids delays within the private network, but also avoids the need for the network to carry the traffic, as international calls are the same price anywhere in the country and thus independent of the originating location.

  4. National calls into the private network are similar to case 2 above, but with one major difference - the caller on the PSTN is paying for the call and expects "normal toll quality".

    Telecom's preference is that such calls are answered at the "entry point" to the private network so that it is clear to a caller that the call was switched across the PSTN at the expected "normal toll quality". Despite Telecom's preference, more and more DDI is being used and incoming calls may be routed via the private network before they are answered. This raises concerns should the end-to-end call prove unacceptable to the caller.

  5. International calls into a private network are another potential problem. New Zealand is a signatory to the International Telecommunications Regulations, which require that international calls will be handled in accordance with ITU Recommendations. Such calls may already have been subjected to a long geographic delay and they are paid for by the overseas caller at a much higher rate than most toll calls.

Telecom prefers that all calls to and from the PSTN of whatever type are carried G.711 from telephone to telephone. The big issue here is that the source of an incoming call may be unknown, so it is important that delays and other impairments be kept to a practicable minimum for ALL incoming PSTN calls.

Prioritising Voice packets
Given that all CPE suppliers achieve optimum Loudness Ratings, get their level and loss plans compliant with our national transmission plan and design the private network correctly, there are various ways of ensuring that voice packets will get priority in a VoIP network. One of these is Multi Protocol Label Switching (MPLS), which will be implemented in Telecom's public IP network.

Nevertheless, ensuring "Quality of Service" is an end-to-end matter and it will, in time, be necessary for all networks carrying a call to agree on and work to the same or compatible protocols if success is to be assured.

Currently, there are various schemes in operation in private networks, but some of these are proprietary and certainly are not "end-to-end" at this early stage. More on this issue in future Newsletters.

Industry Comment
This is the longest Access Standards Newsletter ever, but the length is warranted by the importance of the subject matter. It is more a "tutorial" for those involved in VoIP private network planning and design than a true "newsletter", but it is also important that Telecom places a "stick in the ground" to explain what it expects of other network designers if New Zealand's voice telecommunications quality is not to be unnecessarily degraded.

Industry comment on the issues covered by this Newsletter and any suggestions for maintaining voice quality will be welcomed.






DOUG BURRUS
Manager
Access Standards




APPENDIX 1: LOUDNESS RATING IN AN "ALL-IP" OR "ALL DIGITAL" CALL



FIG. 1 TRANSMISSION PATHS AND PROCESSES IN AN IP PHONE (ETHERNET CONNECTION)

"Loudness Rating" is an objective performance measure for telephones, originally adopted by the ITU and more recently adopted by the EIA/TIA (the North American telecommunications standards organisations). Loudness Rating covers the electrical and acoustic conversions that occur in a telephone, and also includes the physiological characteristics of the human mouth and ear.

The Send Loudness Rating (SLR) is a measurement of the acoustic to electrical conversion efficiency of the handset microphone and associated analogue circuitry. It is measured by producing acoustic tones from a calibrated sound source (the artificial mouth) set at a fixed distance from the handset microphone, and measuring the analogue electrical output. The ratio Analogue Voltage/Acoustic input is calculated at 14 frequencies across the voice band (200 to 4000 Hz). These ratios are weighted in accordance with the characteristics of the human ear and logarithmically averaged, the resulting SLR having the unit "dB". The optimum value for SLR in an IP or digital phone is 8 dB.

The Receive Loudness Rating (RLR) is similarly calculated as a measurement of the electrical to acoustic efficiency of the handset earpiece and associated analogue circuitry, as detected by an artificial ear coupled to the handset receiver. The optimum value for RLR in an IP phone is 2 dB.

Assuming there is no loss or gain in a wholly digital call carried over either an all-IP path, or a combination of IP and digital circuit-switched paths, and both phones have optimum Loudness Ratings, the Overall Loudness Rating (OLR) for the call is the sum of the SLR of one phone and the RLR of the other. The objective OLR is therefore SLR (8 dB) + RLR (2 dB) = 10 dB.

In the above, it is assumed that the D/A and A/D conversion process and coder/decoder process do not add gain or loss. Where the diagram shows "G.7xx Compressed Voice (digital)", this is not necessarily "compressed", as the preferred G.711 encoding is at the full 64 kbit/s rate. Nevertheless, it appears that many VoIP applications do use some form of compression.

What happens when the Loudness Ratings are NOT optimum?
An ideal for telephone performance is to simulate the equivalent of one person talking to another "face to face" in a normal speaking voice - typically at a distance of one metre. This is regarded as a "comfortable" conversation distance for most people.

In Mean Opinion Score tests, almost all users find the receive level "about right" if an OLR of 10 dB is achieved. This has been borne out by extensive user testing over many years.

When the OLR is not "right", human psychology comes into the picture. If the listener hears very quiet speech in his/her telephone receiver, there is an unconscious automatic reaction to raise the voice when responding. If the other party was already hearing the received speech far too loud, the speaker raising his/her voice causes even more discomfort.

The opposite occurs if the received speech appears too loud. That user lowers his/her voice, possibly causing more difficulty for the other party. It is thus important that not only should the OLR be close to the optimum, but that this applies to both directions of transmission. Any significant difference between the two directions can impact quite seriously on overall conversation quality.

Echo can be another very significant factor. If one-way delay in the NZ network exceeds about 15 ms and no echo cancellation is provided, perceived quality is reduced. The louder the send level, the greater the probability that the speaker will hear his/her own voice as an objectionable echo.

Like most transmission parameters, it is a question of balancing the various effects. Achieving optimum settings wherever possible is the best way of maintaining good voice quality.