Are you producing a Podcast and hosting it on your website? Your website is essentially a proprietary distribution platform. Sound familiar? Maybe similar in concept to a broadcast network?
Regarding vague perspectives in relation to whether the “Target Loudness” post production mindset is relevant, or not … hear me out.
Broadcast networks specify audio submission Integrated Loudness targets which include tolerance margins. If an audio submission does not meet the specified requirement(s) – the work is rejected.
In essence networks expect the submitter to properly (let’s say) manipulate prepared works in order to meet requirements prior to submission.
Conversely most music streaming services handle this so called manipulation internally using proprietary methods. They apply perceptual loudness manipulation across submissions in order to establish playback consistency.
For example if -14.0 LUFS is the recognized distribution Integrated Loudness for an arbitrary music streaming service and your mastered music submission checks in at -10 LUFS … the service will subtract 4 LU of gain.
Note if the above scenario is reversed I’m not entirely sure if adding gain is now commonplace. I’ve heard this practice is not widespread. However I do believe select streaming services add gain (and possibly limiting) if necessary.
BTW Loudness Normalization in concept is nothing more than adding/subtracting gain in order to meet a specified target. If added gain causes spec. defined True Peak overshoots – limiting may be applied.
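Here is a minimal Python sketch of that arithmetic. The function names are my own (hypothetical) and do not represent any streaming service's actual method. The ceiling passed to `needs_limiting` is whatever the spec in question defines.

```python
def normalization_gain(measured_lufs: float, target_lufs: float) -> float:
    """Gain offset in LU (1 LU == 1 dB) required to land on the target."""
    return target_lufs - measured_lufs


def needs_limiting(max_true_peak_dbtp: float, gain_db: float,
                   ceiling_dbtp: float) -> bool:
    """True when the gain offset would push the True Peak past the ceiling."""
    return max_true_peak_dbtp + gain_db > ceiling_dbtp


# A -10 LUFS master delivered to a hypothetical -14 LUFS service.
gain = normalization_gain(measured_lufs=-10.0, target_lufs=-14.0)
print(gain)  # -4.0, i.e. the service subtracts 4 LU of gain
```

Subtracting gain never creates overshoots, so limiting only enters the picture when the offset is positive.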
Many music mastering engineers recommend producers simply ignore the loudness target concept. They widely suggest mastering for optimum fidelity and present streaming services with a well produced product that may be efficiently manipulated according to the service’s requirements. All good.
I don’t have access to valid data specifying whether ubiquitous streaming services currently manipulate spoken word Podcasts using the same methods applied to music submissions. I’ll look into it.
* * *
Back to hosting your Podcast on your personal website, or in essence – your platform …
Efficient website accessibility for your Podcast is an essential requirement. My guess is your implemented site player does not manipulate the attributes of your embedded files in order to standardize distribution Integrated Loudness across your hosted catalogue. And I doubt independent producers at large hire coders to build server side audio processing engines to establish what I previously described.
Remember, you – the site owner, producer, whatever – bear the responsibility to serve your listeners with let’s call it optimized audio that is perceptually consistent across all of your hosted programs. Your target may be subjective or it may adhere to published best practices. Again, all good.
Point is – without a recognized Integrated Loudness target including acceptable tolerance margins (and an ultimate True Peak ceiling) – any standardization concept would be near impossible to efficiently implement.
How to Do It
You can certainly attempt to “mix” your programs in RT using a loudness meter thus adhering to various descriptors. However final stage off-line target processing is much more efficient.
Of course the quality of your intermediate and/or pre-master prior to Loudness Normalization will dictate final fidelity and speech intelligibility of the processed output.
Let’s not marginalize the significance of target based audio processing and Loudness Normalization with full True Peak compliance. The general concept works for proprietary broadcast platforms and it is certainly applicable for your personal website where you host your spoken word Podcast.
I’ve heard a few savvy people refer to the LRA (Loudness Range) descriptor as inherent Dynamic Range. This reference is for the most part inaccurate.
LRA is a threshold gated statistical representation of measured Loudness or variations as such over time. Incorporated Absolute and Relative gating prevents potentially skewed measurements that may result when the passing audio includes sudden instances of impactful amplitude (e.g. gun shots, explosions, etc.) and/or extended periods of silence.
Correlation certainly exists between inherent LRA and Dynamics. In fact – in order to optimize audio for a particular delivery platform, an accurately measured LRA may indicate whether further dynamics manipulation across a segment of audio may be necessary.
PSR and PLR
It is commonplace to acknowledge PSR (Peak to Short Term Loudness Ratio) and PLR (Peak to Loudness Ratio) as accurate indicators of audio dynamics.
PSR is the differential between the measured (ungated) Short Term Loudness and the max. True Peak ceiling. The duration of the averaging window (3 sec.) and the resulting Short Term Loudness measurement relative to the maximum True Peak reflects a near real time representation of playback audio dynamics. High relative PSR values suggest wide dynamics. Conversely low relative PSR values suggest reduced dynamics, excessive limiting, and elevated perceived loudness.
E.g. RT Short Term Loudness: -12 LUFS. Max True Peak -2.0. PSR = 10. As the Short Term Loudness elevates, the differential between it and the True Peak max. decreases thus indicating reduced RT dynamics.
PLR is the differential between measured (gated) Integrated Loudness of (in most cases) an entire audio segment from start to stop and the max True Peak ceiling. In essence PLR represents a long term gated average of inherent dynamics over time.
E.g. measured Integrated Loudness: -16 LUFS. Max True Peak -2.0. PLR = 14. In comparison – if the measured Integrated Loudness checked in at -12 LUFS, the PLR would shift to 10, thus indicating reduced global dynamics and elevated loudness.
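Both ratios are simple differentials, so the examples above reduce to a couple of subtractions. A quick sketch (the function names are mine, purely illustrative):

```python
def psr(max_true_peak_dbtp: float, short_term_lufs: float) -> float:
    """Peak to Short Term Loudness Ratio: max True Peak minus Short Term Loudness."""
    return max_true_peak_dbtp - short_term_lufs


def plr(max_true_peak_dbtp: float, integrated_lufs: float) -> float:
    """Peak to Loudness Ratio: max True Peak minus Integrated Loudness."""
    return max_true_peak_dbtp - integrated_lufs


print(psr(-2.0, -12.0))  # 10.0
print(plr(-2.0, -16.0))  # 14.0
print(plr(-2.0, -12.0))  # 10.0, louder program, reduced global dynamics
```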
LRA vs. Dynamic Range
As far as this vague reference to LRA indicating Dynamic range – consider the following:
A hypothetical mastered (spoken word) Podcast checks in at -16 LUFS with a -2 dBTP max. The inherent PSR (measured in RT using a Loudness Meter) = approx. 10. The PLR = 14, and the measured LRA = 4 LU.
In this example – it is obvious the LRA (4 LU) does not reflect the theoretical dynamic range of the piece. In fact the PSR is the suitable indicator of RT audio dynamics. The PLR represents the global dynamics over the entire duration of the audio segment.
The LRA descriptor is an algorithmic calculation incorporating gated thresholds. It does not indicate the measured Dynamic Range of a piece of audio. However it is certainly a viable indicator representing the statistical variation of measured Loudness over time.
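To make the "algorithmic calculation incorporating gated thresholds" a bit more concrete, here is a rough sketch of how I understand the published LRA method (EBU Tech 3342): Short Term Loudness values are gated at -70 LUFS absolute, then at 20 LU below the gated average, and the result is the spread between the 10th and 95th percentiles. The nearest-rank percentile helper is my own simplification, and the input is assumed to be a list of Short Term readings you already collected from a meter.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0
RELATIVE_GATE_LU = -20.0  # the LRA gate, not the -10 LU Integrated Loudness gate


def energy_mean(values_lufs):
    """Average loudness values in the power domain, then convert back."""
    powers = [10 ** (v / 10) for v in values_lufs]
    return 10 * math.log10(sum(powers) / len(powers))


def nearest_rank(sorted_values, percent):
    """Simplified percentile: pick the nearest index (an assumption of this sketch)."""
    index = int((len(sorted_values) - 1) * percent / 100 + 0.5)
    return sorted_values[index]


def loudness_range(short_term_lufs):
    """Statistical spread (in LU) of gated Short Term Loudness readings."""
    abs_gated = [v for v in short_term_lufs if v >= ABSOLUTE_GATE_LUFS]
    if not abs_gated:
        return 0.0
    threshold = energy_mean(abs_gated) + RELATIVE_GATE_LU
    gated = sorted(v for v in abs_gated if v >= threshold)
    return nearest_rank(gated, 95) - nearest_rank(gated, 10)
```

Notice nothing in this calculation looks at peaks. That is why LRA and PSR/PLR answer different questions.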
An elevated spoken word LRA (> 7 LU) may indicate compromised intelligibility, and as noted – the necessity for further DSP processing and re-mastering.
For RT measurement of inherent audio dynamics, use a supported tool to display the running PSR and PLR values. There are various third party options available, such as Dynameter by MeterPlugs, MasterCheck by Nugen Audio, and the Youlean Loudness Meter.
For a (stereo) -16.0 LUFS spoken word Podcast – PLR 15/14 is optimal. Corresponding PSR values will vary based on the attributes of applied dynamics processing.
Incidentally – if you are producing Podcasts professionally, you need to learn how to use a Loudness Meter. It is an essential tool, providing a broad scope of RT descriptors, such as Loudness, LRA, Dynamic Range, and True Peak. A number of meters support offline measurements within certain DAW environments.
Scores of audio producers in the Podcast Production space have adopted an inaccurate term when referring to basic Loudness Normalization: Loudness Leveling.
First – what is Loudness Normalization? Actually, it’s quite simple:
Audio is measured in its entirety. The existing Integrated (Program) Loudness is determined. A gain offset is applied relative to a spec. based or subjective Integrated Loudness target.
For example: if the source audio measures -20 LUFS, and the Loudness Target is -16 LUFS, +4 LU of gain will be applied.
As well, a True Peak Max. Ceiling is defined, which again may be spec. based or subjective. If the required Integrated Loudness gain offset results in overshoots – limiting is applied in order to maintain compliance.
It’s important to note that Loudness Normalization does not correct wide variations in audio levels. As well – it does not guarantee optimized intelligibility for spoken word. If an audio piece (e.g. multiple participant segment) contains inconsistencies as such, the Loudness Normalization gain offset will simply elevate (or reduce) the relative perceptual loudness of the audio. The original dynamic attributes will persist.
That’s it. There’s nothing more to it unless the Loudness Normalization tool features some sort of dynamics optimization process that may or may not be active.
For the record – the Loudness Module included in iZotope’s RX 7 Advanced Audio Editor applies basic Loudness Normalization (measurement, gain, and limiting). It does not apply optimization processing.
View this source clip waveform. There are two participants with noticeable level inconsistencies:
This is the same clip Loudness Normalized (to -19.0 LUFS). The perceptual loudness is higher. However the level inconsistencies persist:
Leveling is a process that addresses and corrects noted inconsistencies and level variations. It is accomplished by the use of gain riding plugins and/or specialty tools that rely on complex algorithms. One basic example is the use of an “RMS” Compressor featuring an optimal and often extended release time parameter.
This is a “leveled” version of the original source clip displayed above. The previously persistent level inconsistencies no longer exist.
Finally, this is the leveled audio, Loudness Normalized to -19.0 LUFS. The described processes were in fact discrete.
I hope I’ve made it clear that the term Loudness Leveling is not an accurate term to describe Loudness Normalization. The key is that Loudness Normalization is gain and limiting. It does not correct inconsistent level variations. You’ll need to implement discrete Leveling processes to address any persistent inconsistencies.
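If it helps, here is a toy Python sketch of the distinction. The segment levels and the measured program loudness are made-up numbers, and `crude_level` is a deliberately naive stand-in for gain riding (hypothetical, not how any leveling tool actually works).

```python
def loudness_normalize(segment_levels, program_lufs, target_lufs):
    """One static gain offset for the whole piece. Level differences persist."""
    gain = target_lufs - program_lufs
    return [level + gain for level in segment_levels]


def crude_level(segment_levels, anchor_lufs, max_gain_db=12.0):
    """Naive gain riding: nudge each segment toward an anchor, within a clamp."""
    leveled = []
    for level in segment_levels:
        gain = max(-max_gain_db, min(max_gain_db, anchor_lufs - level))
        leveled.append(level + gain)
    return leveled


# Two alternating participants, one roughly 9 LU quieter than the other.
segments = [-17.0, -26.0, -18.0, -27.0]

normalized = loudness_normalize(segments, program_lufs=-20.0, target_lufs=-16.0)
leveled = crude_level(segments, anchor_lufs=-20.0)

print(max(normalized) - min(normalized))  # 10.0, same spread as the source
print(max(leveled) - min(leveled))        # 0.0
```

The normalized version is louder but just as inconsistent. Only the leveling pass changes the relationship between the two voices.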
I was just reading Chris Curran’s Daily Goody segment, published today. The piece is titled Balancing the Levels of All Voices. Chris explains the importance of consistent dialogue levels across multiple participants, and shares various methods to achieve this.
Chris states in his second tip:
>>>“Another way to quickly balance the levels of various participants is to process each participants track to be the same LUFS level. This will make them close to level, but you will always want to adjust the levels slightly using your ears. Because even when the LUFS level of two different voices is the same, the perceived loudness of each voice can differ due to things like proximity to the mic, dynamic range, frequency response of the mic, the timbre of individual voices, etc. So it’s a handy practice to set the LUFS level of each participant to the same value, but then you still have to use your ears.”<<<
Good advice IMHO. Here’s my perspective …
The term LUFS Level is a generalization. It requires clarification.
There are 3 notable measurement descriptors that indicate perceptual Loudness in LUFS/LKFS (or LU’s when using a relative scale):
• Integrated Loudness (also referred to as Program Loudness)
• Short Term Loudness
• Momentary Loudness
Their distinguishing attributes are distinct time and/or averaging intervals: Integrated (cumulative measurement from start to finish), Short Term (3 sec.), and Momentary (400ms). It’s important to recognize the significance of each descriptor.
As well, (and Chris alludes to this in his piece) – you must recognize how a consistent Integrated Loudness measurement across multiple spoken word segments (or session participants) does not necessarily guarantee suitable matched level perception and/or optimized intelligibility.
Remember – Integrated Loudness represents a cumulative measurement from start to finish. For 100% accuracy – the piece must be measured in its entirety. Also, the descriptor does not reflect inherent dynamic attributes and/or inconsistencies that may in turn marginalize attempts to optimize perception.
With this in mind, if you choose to use Integrated Loudness as a perceptual Loudness matching indicator – audio optimization (compression, etc.) and target accuracy must be applied and established before relying on any common Integrated Loudness measurement.
What about Short Term/Momentary Loudness?
The 3 sec. averaging interval of the Short Term Loudness descriptor indicates an active, foreground measurement. It is highly useful when analyzing the loudness consistency of spoken word/dialogue. Momentary Loudness will provide even finer “detail” – once again due to its inherent averaging interval (400ms).
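Here is a simplified sketch of why the window length matters. It slides a 400ms and a 3 sec. window over the same mono samples and converts the mean square of each window to a loudness value. Big assumption: I skip the K-weighting filter entirely to keep it short, so these are not compliant readings. The 100ms hop and the function names are also my own choices.

```python
import math


def window_loudness(samples, sample_rate, window_s, hop_s=0.1):
    """Loudness per sliding window (mono, NO K-weighting, illustration only)."""
    win = int(round(window_s * sample_rate))
    hop = int(round(hop_s * sample_rate))
    readings = []
    for start in range(0, len(samples) - win + 1, hop):
        block = samples[start:start + win]
        mean_square = sum(x * x for x in block) / win
        if mean_square > 0:
            readings.append(-0.691 + 10 * math.log10(mean_square))
        else:
            readings.append(float("-inf"))
    return readings


def momentary(samples, sample_rate):
    return window_loudness(samples, sample_rate, window_s=0.4)


def short_term(samples, sample_rate):
    return window_loudness(samples, sample_rate, window_s=3.0)
```

Feed both functions a quiet passage with one brief loud burst: the Momentary readings jump by the full amount while the Short Term readings smooth the burst across 3 seconds. That is the finer "detail" in practice.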
To summarize: “LUFS Level” is a generalization. As noted there are 3 descriptors (Integrated, Short Term, Momentary). Short Term and Momentary Loudness are useful indicators for the establishment of spoken word consistency. Learn how to use a Loudness Meter (online or offline) to closely monitor each descriptor.
With regards to Loudness Normalization – some processing tools such as RX Loudness Control by iZotope (AAX/Pro Tools only) support user defined Short Term and Momentary Loudness targeting within a certain tolerance range.
These options, along with the ubiquitous Integrated Loudness definition (and of course subjective audio processing) should provide everything you need in your quest to achieve optimized dialogue.
LevelView by Grimm Audio is a highly functional and well designed real time Loudness Meter.
Here are the details:
LevelView features a unique multifaceted Rainbow Meter. Clicking the Rainbow display toggles the Meter scale (EBU +9 or EBU +18).
There are three compliance modes: EBU R128, ATSC A/85, and a custom User specification (Gated or Ungated). The Rainbow Meter displays a Relative Scale. Consequently the defined target will be equivalent to 0 LU.
The upper blue Rainbow arc represents Short Term Loudness measured within a 3 sec. time frame. The inward blue arcs indicate slower time frame variances (10, 30, 90, and 270 seconds).
The arced needle meter located above the Rainbow Meter represents the Momentary Loudness measured within a 400ms time frame.
Visual dots displayed (and held) on both the Momentary and Short Term Loudness indicator plots represent the maximum values for each descriptor. Both indicators will shift to orange when their values exceed recognized guidelines (+8 max M, and +6 Max S).
The numerical descriptor table features a large Integrated Loudness value. This may display an Absolute Scale value in LUFS, or a Relative Scale value in LU’s. Clicking the descriptor text toggles its view.
Additional numerical descriptors include maximum Momentary Loudness (max M), maximum Short Term Loudness (max S), LRA (Loudness Range), PLR (Peak to Loudness Ratio), and maximum True Peak (max TP). Clicking the max TP descriptor text will toggle the measurement algorithm and display max TP or max SP (Sample Peak). Descriptors will shift to orange when a displayed value exceeds recognized or specification guidelines.
The graph located at the lower left is the Loudness Range histogram. It displays the distribution of the measured Loudness over time. The data will indicate whether further dynamic range compression may be necessary.
LevelView supports Manual start and stop measurements. Setting the meter to Auto will force it to follow the host DAW’s transport. In essence the meter will automatically start/stop and reset based on the status of the transport.
Link mode records and stores data continuously. This allows the operator to revert back in time and re-measure a passage without resetting the stored measurements. In the event a passage is skipped, a gap warning will appear in orange. Re-measurement of a skipped segment will clear the gap warning. The Stop button resets the memory. Note the LevelView documentation indicates that the host “must provide time code for the Link function to work.”
It is possible to run various connected (Host and Client) instances of LevelView on a network or over the Internet. I will be testing these options in the near future.
LevelView is available as an AU, VST, or AAX Plugin. The AU and VST versions support (5.1) Surround Sound measurement. The meter conforms to the SMPTE/ITU channel matrix standard (L-R-C-LFE-Ls-Rs).
The meter may also run in a stand-alone mode with no DAW dependency. I/O configuration options are provided.
I like this meter and I appreciate its unique design and accuracy. The networking options, support for Surround Sound, and stand-alone capability make it highly flexible and well worth its reasonable cost ($70 U.S. at Don’tCrack). I’m happy to recommend it.
Improvements I’d like to see:
– Scalable UI
– Option to define a custom Maximum True Peak in the User mode (currently it defaults to -1.0 dBTP)
The attached image displays a processing workflow designed to optimize Spoken Word intelligibility. The workflow also demonstrates a realtime example of Integrated Loudness compliance targeting.
There are 7 reference point Sections worth noting:
Section A includes the Adobe Audition Effects Rack Signal Level Meters indicating the source (Input) level and the (Output) level. The Output level reflects the results of the workflow’s inserted plugins. The chain includes a Compressor, a Limiter, and a Loudness Meter. Note the level meters indicate signal level. They do not indicate or represent perceptual Loudness.
Section B displays the gain reduction applied by the Compressor at the current position of the playhead. For the test/source audio I determined an average of 6dB of gain reduction would yield acceptable results. The purpose of this stage is to reduce the dynamic range and/or dynamic structure of the Spoken Word resulting in optimized intelligibility AND to prevent excessive down stream limiting. This is an important workflow element when preparing Spoken Word audio for Internet/Mobile, and Podcast distribution.
Section C includes my subjective limiting parameters. The Limiter will add the required amount of gain to achieve a -16.0 LUFS deliverable while adhering to a -1.5 dBTP (True Peak Max). If the client, platform, or workflow requires an alternative Loudness target and/or Maximum True Peak ceiling – the parameters and their mathematical relationship may be altered for customized targeting. Please note the Maximum True Peak referenced in any spec. is more of a ceiling as opposed to a target. In essence the measured signal level may be lower than the specified maximum.
Section D indicates the amount of limiting that is occurring at the current position of the playhead.
Section E displays the user defined Integrated Loudness target located above the circular Momentary Loudness LED (12 o’clock position). The defined Integrated Loudness target is also visually represented by the Radar’s second concentric circle. The Radar display indicates the Short Term Loudness measured over time within a 3 sec. window. The consistency of the Short Term Loudness is evident indicating optimized intelligibility.
Section F displays the unprocessed source audio that lacks optimization for Internet/Mobile, and Podcast distribution. Any attempt to consume the audio in its current state in a less than ideal listening environment will result in compromised intelligibility. Mobile device consumption in like environments will exacerbate compromised intelligibility.
Section G displays the processed/optimized audio suitable for the noted distribution platform. The Integrated Loudness, True Peak, and LRA descriptors now satisfy compliance targets. Notice there is no indication of excessive limiting.
I thought I’d revisit various aspects of Loudness Meter Absolute/Relative Scale correlation, and provide a visual representation of a real time processing Session with both Scales active.
Descriptors and Scales
Modern Loudness Meters display various descriptors including Program Loudness – also referred to as Integrated Loudness. There are two scales that can be used to display measured Program or Integrated Loudness over time …
The most common is an Absolute Scale, displayed in LUFS or LKFS. LUFS refers to Loudness Units relative to Full Scale. LKFS refers to Loudness Units K-Weighted relative to Full Scale. There is no difference in the perceptual measured loudness between both descriptor references.
It is also possible to measure and display Integrated/Program Loudness as Loudness Units (or LU’s) on a Relative Scale where 1LU == 1 dB.
When shifting to a Relative Scale, the 0 LU increment is always equivalent to the Meter’s user defined or spec. defined Absolute Loudness target.
For example, in an R128 -23.0 LUFS Absolute Scale workflow, setting the Meter to display a Relative Scale changes the target to 0 LU.
So – if a piece of measured audio checks in at -23.0 LUFS on an Absolute Scale, it would be perceptually equal to measured audio checking in at 0 LU on a Relative Scale.
Likewise if the Meter’s Absolute Scale target is set to -16.0 LUFS, it will correlate to 0 LU on a Relative Scale. Again both would reflect perceptual equivalence.
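The Absolute/Relative correlation is nothing more than an offset by the target. A tiny sketch (function names are mine):

```python
def to_relative_lu(measured_lufs: float, target_lufs: float) -> float:
    """Absolute Scale (LUFS) reading expressed on a Relative Scale (LU)."""
    return measured_lufs - target_lufs


def to_absolute_lufs(relative_lu: float, target_lufs: float) -> float:
    """Relative Scale (LU) reading expressed on an Absolute Scale (LUFS)."""
    return relative_lu + target_lufs


print(to_relative_lu(-23.0, target_lufs=-23.0))  # 0.0 in an R128 workflow
print(to_relative_lu(-16.0, target_lufs=-16.0))  # 0.0 with a -16.0 LUFS target
print(to_absolute_lufs(-0.5, target_lufs=-16.0))  # -16.5
```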
All broadcast delivery specifications suggest Absolute Scale Integrated Loudness targets. However, for any number of subjective reasons – many operators prefer to use the alternative Relative Scale and “mix or master to 0 LU.”
Please note Loudness Units are also the proper way in which to describe Loudness differentials between two programs. For instance, “Program (A) is +2 LU louder than Program (B).” One might also describe gain offsets in LU’s as opposed to dB’s.
Hornet Plugins recently released Hornet LU Meter. This tool is a Loudness Meter plugin designed to measure and display Integrated/Program Loudness within a 400ms time window. This measurement represents the Momentary Loudness descriptor.
The Meter is indeed nifty and affordable. However there is one sort of caveat worth noting: As the name suggests, it is an LU Meter. In essence Integrated (Momentary) Loudness measurements are solely displayed on a Relative Scale.
The displayed Session (image) consists of a single mono VO clip. The objective is to print a processed stereo version in RT checking in at -16.0 LUFS with a maximum True Peak no higher than -2.0 dBTP.
The output of the mono VO track is routed to a mono Auxiliary Input track titled Normalize. If you are not familiar with Pro Tools, an Auxiliary Input track is not the same as an Auxiliary Send. Auxiliary Input tracks allow the user to pass signal using buses, insert plugins, and adjust level. They are commonly used to create sub-mixes.
I’ve inserted a Compressor and a Limiter on the Normalize Auxiliary Input track. The processed audio is passing through at -19.0 LUFS (mono).
The audio is then routed to a second (now stereo) Auxiliary Input track titled Offset. I use the track fader to apply a +3 dB gain offset. This will reconstitute the loss of gain that occurs on center panned mono tracks. The attenuation is a direct result of the Pro Tools Pan Depth setting.
The signal flow/output is now passing -16.0 LUFS audio. It is routed to a standard audio track titled Print. When this track is armed to record, it is possible to initiate a realtime bounce of the processed/routed audio.
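For anyone curious why -19.0 LUFS mono ends up at (roughly) -16.0 LUFS on the stereo path, here is a back-of-the-envelope sketch. It relies on the fact that the ITU BS.1770 measurement sums the power of the left and right channels. The -3.0 dB Pan Depth value is my assumption for this session, and the helper name is hypothetical.

```python
import math


def stereo_loudness(left_mean_square: float, right_mean_square: float) -> float:
    """BS.1770-style loudness from per-channel (already K-weighted) mean squares."""
    return -0.691 + 10 * math.log10(left_mean_square + right_mean_square)


# Mean square of a mono signal that measures -19.0 LUFS on its own.
mono_ms = 10 ** ((-19.0 + 0.691) / 10)

# Center panned with an assumed -3.0 dB Pan Depth: each side carries the
# signal 3 dB down, so the summed power is back to roughly the mono value.
pan_loss = 10 ** (-3.0 / 10)
centered = stereo_loudness(mono_ms * pan_loss, mono_ms * pan_loss)

# After the +3 dB fader offset, each side carries the full-level signal.
restored = stereo_loudness(mono_ms, mono_ms)

print(round(centered, 2))  # about -18.99
print(round(restored, 2))  # about -15.99
```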
Notice the instances of the Hornet LU Meter and TC Electronic Loudness Radar. Both Meters are inserted on the Master Bus and are measuring the session’s Master Output.
I set the Reference (target) on the Hornet LU Meter to -16.0 LUFS. In essence 0 LU on its Relative Scale represents -16.0 LUFS.
Conversely the TC Electronic Meter is configured to display Absolute Scale measurements. The circular LED that borders the Radar area indicates Momentary Loudness. The defined Integrated Loudness target is displayed under the arrow at the 12 o’clock position.
Remember the Hornet LU Meter solely displays Momentary Loudness. If you compare its current reading to the indication of Momentary Loudness on the TC Electronic Meter, the relationship between Relative Scale and Absolute Scale measurement is clearly indicated. Basically the Hornet Meter registers just below 0 LU. The TC Electronic Meter registers just below -16.0 LUFS.
I will say if you are comfortable monitoring real time Momentary Loudness and understand Relative/Absolute Scale correlation, the Hornet tool is quite useful. In fact it contains additional features such as Grouping, auto/manual Gain Compensation, and auto-Maximum Peak protection.
Additional insight on the K-weighting Curve or K-weighted filtering:
K-weighting suggests de-emphasized low frequencies by way of a high-pass filter. A high-shelving filter is applied to the upper frequency range, and the measured data is averaged.
TC Electronic describes applied K-weighting on audio channels as a “method to build a bridge between subjective impression and objective measurement.”
This is a re-post of an article that I published in October, 2015 …
In a recent Midroll article titled “Why Programmatic Ads Aren’t Necessarily Great for Podcasting,” the staff writer states:
“A number of players in the Podcasting and advertising industries are making bets on programmatic Ad delivery — dynamically inserting Ads into a Podcast as the episode is downloaded. It’s an understandable temptation, but we at Midroll see some tradeoffs.”
I wonder how networks will handle potential perceived Loudness inconsistencies between produced Ads and new or preexisting programs?
I’ve mentioned my past affiliation with IT Conversations and The Conversations Network, where I was the lead post audio engineer from 2005-2012. Executive Director Doug Kaye built a proprietary content management system and infrastructure that included an automated component based Show Assembly System. Audio components were essentially audio clips (Intros, Outros, Ads, Credits, etc.) combined server side into Podcasts in preparation for distribution.
One key element in this implementation was the establishment of perceived Loudness consistency across all submitted audio components. This was accomplished by standardizing an average Loudness Target using a proprietary software RMS Normalizer to process all server side audio components prior to assembly. (Loudness Normalization is now the recommended process for Integrated Loudness targeting and consistency).
Due to this consistency, all distributed Podcasts were perceptually equal with regard to Integrated or Program Loudness upon playback. This was for the benefit of the listener, removing the potential need to make constant playback volume adjustments within a single program and throughout all programs distributed on the network.
Regarding Programmatic Ad insertion, I have yet to come across a Podcast Network that clearly states a set Integrated Loudness Target for submitted programs. (A Maximum True Peak requirement is equally important. However this descriptor has no effect on perceptual Loudness consistency).
Due to the absence of any suggested internal network guidelines or any form of standardized Loudness Normalization, dynamic Ad insertion has the potential to ruin the perceptual consistency within single programs and throughout the contents of an entire network.
Many conscientious independent producers have embraced the credible -16.0 LUFS Integrated Loudness Target for stereo Internet/Mobile/Podcast audio distribution (the perceptual equivalent for mono distribution is -19.0 LUFS). It’s far from a requirement, and nothing more than a suggested guideline.
My hope is Podcast Networks will begin to recognize the advantages of standardization and consider the adoption of the -16.0 LUFS Integrated Loudness Target. Dynamically inserted Ads must be perceptually equal to the parent program. Without a standardized and pre-disclosed Integrated Loudness Target, it will be near impossible to establish any level of distribution consistency.
I recently analyzed a few of the internal Podcasts produced by CNN. One particular installment is yet another example of a major media outlet distributing audio that is in my view unsuitable for this particular platform.
Let’s discuss file attributes and measured specs. for one of CNN’s distributed Podcasts:
The distributed audio is mono, 64kbps, with music elements. I’ve stated how I feel about this. I’m not a proponent of 64 kbps MP3 audio PERIOD (mono or stereo). In general audio in this format sounds horrible. Feel free to disagree.
Secondly, the Integrated (Program) Loudness for this particular program is just about -23.0 LUFS with a Maximum True Peak of +0.40 dBTP. From my perspective the perceptual Loudness misses the mark. And, the audio is clipped.
Lastly, the produced audio is way too dynamic for spoken word. The perceptual inconsistency of the delivery by the participants is inadequate when considering how (for the most part) this program will be consumed (mobile devices, problematic ambient spaces, etc.).
I decided to sort of showcase this particular program because it is a good candidate for flexible Target considerations. What do I mean by “flexible Target considerations?” Let me explain …
Again, the distributed file is mono. The recommended Integrated Loudness Target for mono Podcasts is -19.0 LUFS. This is the perceptual equivalent of -16.0 LUFS stereo. If I were to apply a +4 dB gain offset to Loudness Normalize this audio to -19.0 LUFS, there would be very little change in the original dynamic structure of the audio. However without some form of aggressive limiting, the maximum amplitude or Peak Ceiling would be driven into oblivion. In fact audible distortion may occur with or without limiting. This is obviously not recommended.
There are two options to consider: 1) apply Dynamic Range Compression before Loudness Normalization, or 2) shoot for a lower Integrated Loudness target. For this particular example I chose to implement both options.
First, in my view optimizing the dynamics in this program for Podcast distribution is unavoidable. It’s just way too choppy and it lacks delivery consistency for spoken word. Also, by lowering the L.Normalized Target, the necessary added gain offset will be reduced resulting in less aggressive limiting. In addition, the reduced amount of added gain will curtail noise floor elevation and other variables such as exaggerated breaths.
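To put rough numbers on the trade-off, here is a small sketch that estimates how far the limiter would have to pull the peaks down for a given target. It uses the measured values from the distributed file (-23.0 LUFS, +0.40 dBTP). The -1.0 dBTP ceiling is an assumed value for illustration, the function names are mine, and the math deliberately ignores the compression stage (which lowers the peaks before any gain is added).

```python
def peak_after_gain(max_true_peak_dbtp, measured_lufs, target_lufs):
    """Where the max True Peak lands once the normalization offset is applied."""
    return max_true_peak_dbtp + (target_lufs - measured_lufs)


def limiting_required_db(max_true_peak_dbtp, measured_lufs, target_lufs,
                         ceiling_dbtp):
    """Worst case gain reduction the limiter must apply to respect the ceiling."""
    overshoot = peak_after_gain(max_true_peak_dbtp, measured_lufs,
                                target_lufs) - ceiling_dbtp
    return max(0.0, overshoot)


for target in (-19.0, -20.2):
    needed = limiting_required_db(0.40, -23.0, target, ceiling_dbtp=-1.0)
    print(f"target {target} LUFS: about {needed:.1f} dB of limiting")
```

Dropping the target by 1.2 LU removes 1.2 dB of worst case limiting. Compressing first removes more, which is why I used both options.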
As noted the distributed Podcast (displayed in the attached upper waveform example) checks in at -23.0 LUFS and it is clipped. My optimized version (displayed in the lower waveform example) checks in at -20.2 LUFS with a Maximum True Peak of -1.23 dBTP. It is well within a reasonable level of Program Loudness tolerance for Podcast L.Normalization. In fact the perceptual difference between the processed -20.0 LUFS audio and a -19.0 LUFS version would be pretty much undetectable. In essence the audio has been optimized and it exhibits improved intelligibility. It is now well suited for Podcast distribution.
(If you are interested in the tools that I use, they are listed under Available Services).
It is no secret that I am a staunch proponent of the -16.0 LUFS/-19.0 LUFS recommendations for Podcasts. However, in certain situations – tolerance for slightly reduced Program Loudness Targets is acceptable.
For the record – my remaster is much easier to listen to. CNN can do better.
Consider this: Two extended segments of audio, Loudness Normalized (or mixed in real time) to the same Integrated Loudness Target.
Segment (A) is fairly consistent, with a very limited amount of intermittent silence gaps.
Segment (B) is far less consistent, due to a multitude of intermittent silence gaps.
When passing both segments through a Loudness Meter (or measuring the segments offline), and recognizing Integrated Loudness is a reflection of the average perceptual Loudness of an entire segment – how will inherent silence affect the accuracy of the cumulative measurements?
In theory the silence gaps in Segment (B) should affect the overall measurement by returning a lower representation of average Integrated Loudness. If additional gain is added to compensate, Segment (B) would be perceptually louder than Segment (A).
Basically without some sort of active measurement threshold, the algorithms would factor in silence gaps and return an inaccurate representation of Integrated Loudness.
In order to establish perceptual accuracy, silence gaps must be removed from active measurements. Loudness Meters and their algorithms are designed to ignore silence gaps. The omission of silence is based on the relationship between the average signal level and a predefined threshold.
Loudness Meter (G10) Gate
The specification Gate (G10) is an aspect of the ITU Loudness Measurement algorithms included in compliant Loudness Meters. Its function is to temporarily pause Loudness measurements when the signal drops below a relative threshold, thus allowing only prominent foreground sound to be measured.
The relative threshold sits 10 LU below the loudness measured with only the Absolute Gate applied. Momentary and Short Term measurements are not gated. The Absolute Gate, set at -70 LUFS, forces metering to ignore extreme low level noise.
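To make the effect of the gate concrete, here is a minimal Python sketch. It is not a compliant meter. It assumes the per-block loudness values have already been measured (the two block lists are made up for illustration), and it applies the gates the way I read ITU-R BS.1770: the -70 LUFS Absolute Gate first, then a relative gate 10 LU below the loudness of whatever passed the Absolute Gate.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0   # blocks at or below this are always ignored
RELATIVE_GATE_LU = 10.0      # relative gate sits this far below the mean


def mean_loudness(blocks):
    """Average block loudness in the power domain, returned in LUFS.

    Assumes a non-empty list of block loudness values.
    """
    powers = [10 ** (b / 10) for b in blocks]
    return 10 * math.log10(sum(powers) / len(powers))


def integrated_loudness(blocks):
    """Gated Integrated Loudness from per-block loudness values (LUFS)."""
    above_abs = [b for b in blocks if b > ABSOLUTE_GATE_LUFS]
    relative_threshold = mean_loudness(above_abs) - RELATIVE_GATE_LU
    gated = [b for b in above_abs if b > relative_threshold]
    return mean_loudness(gated)


# Hypothetical data: Segment A is consistent, Segment B is 40% near-silence.
segment_a = [-16.0] * 100
segment_b = [-16.0] * 60 + [-60.0] * 40

print(f"B without relative gate: {mean_loudness(segment_b):.1f} LUFS")   # -18.2
print(f"B with gating:           {integrated_loudness(segment_b):.1f} LUFS")  # -16.0
print(f"A with gating:           {integrated_loudness(segment_a):.1f} LUFS")  # -16.0
```

Without the relative gate, Segment (B) reads roughly 2 LU low, and compensating for that reading would make its speech louder than Segment (A). With the gate, both segments report the same foreground loudness.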
Most Loudness Meters reveal a visual indication of active gating (see attached image) and confirm the accuracy of displayed measurements.
Additional “Gate” Generalizations and Nomenclature
A Downward Expander’s applied attenuation depends on signal level once the signal drops below a user defined threshold. The Ratio dictates the amount of attenuation. Alternatively a Noise Gate functions independent of signal level. When the level drops below the defined threshold, hard muting is applied.
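The difference between the two is easier to see as static gain curves. The Python sketch below is only an illustration: the function names are hypothetical, the threshold and levels are made up, and I am assuming the common N:1 convention where each 1 dB drop below the threshold at the input becomes an N dB drop at the output.

```python
import math


def downward_expander_gain_db(level_db, threshold_db, ratio):
    """Gain in dB (0 or negative) applied by an N:1 downward expander.

    Attenuation grows with how far the level sits below the threshold.
    """
    if level_db >= threshold_db:
        return 0.0
    return (level_db - threshold_db) * (ratio - 1.0)


def noise_gate_gain_db(level_db, threshold_db):
    """Gain in dB applied by a hard noise gate: unity above, muted below."""
    if level_db >= threshold_db:
        return 0.0
    return -math.inf  # hard mute, regardless of how far below threshold


THRESHOLD_DB = -40.0  # hypothetical setting
for level in (-35.0, -46.0, -52.0):
    exp_gain = downward_expander_gain_db(level, THRESHOLD_DB, ratio=2.0)
    gate_gain = noise_gate_gain_db(level, THRESHOLD_DB)
    print(f"{level:6.1f} dB in -> expander {exp_gain:6.1f} dB, gate {gate_gain} dB")
```

With a 2:1 ratio the expander pulls a -46 dB signal down by 6 dB and a -52 dB signal down by 12 dB, while the gate mutes both outright.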
Silence Gate is a somewhat proprietary term. It is a parameter setting available on the Aphex 320A and 320D Compellor hardware Leveler/Compressor.
When a passing signal level drops below the user defined Silence Gate threshold for 1 second or longer, the device’s VCA (Voltage Controlled Amplifier) gain is frozen. The Silence Gate will prevent the Leveling and Compression processing from releasing and inadvertently increasing the audibility of background noise.
The document is a collection of Loudness processing guidelines for diverse platform dependent media streaming and downloading. This would include music, spoken word, and possible high dynamic audio in video streams. The document credits some of the most well respected industry leading professionals, including Bob Katz, Thomas Lund, and Florian Camerer. The term “Podcast” is directly referenced once in the document, where the author(s) state:
“Network file playback is on-demand download of complete programs from the network, such as podcasts.”
I support the purpose of this document, and I understand the stated recommendations will most likely evolve. However in my view the guidelines have the potential to create a fair amount of confusion for producers of spoken word content, mainly Podcast producers. I’m specifically referring to the suggested 4 LU range (-16.0 to -20.0 LUFS) of acceptable Integrated Loudness Targets and the solutions for proper targeting.
Indeed compliance within this range will moderately curtail perceptual loudness disparities across a wide range of programs. However the leniency of this range is what concerns me.
I am all for what I refer to as reasonable deviation or “wiggle room” in regard to Integrated Loudness Target flexibility for Podcasts. However IMHO a -20 LUFS spoken word Podcast approaches the broadcast Loudness Targets that I feel are inadequate for this particular platform. A comparable audio segment with wide dynamics will complicate matters further.
I also question the notion (as stated in the document) of purposely precipitating clipping when adding gain “to handle excessive peaks.”
And there is no mention of the perceptual disparities between Mono and Stereo files Loudness Normalized to the same Integrated Loudness Target. For the record I don’t support mono file distribution. However this file format is prevalent in the space.
I feel the document’s perspective is somewhat slanted towards platform dependent music streaming and preservation of musical dynamics. In this category, broad guidelines are for the most part acceptable. This is due to the wide range of production techniques and delivery methods used on a per musical genre basis. Conversely spoken word driven audio is not nearly as artistically diverse. Considering how and where most Podcasts are consumed, intelligibility is imperative. In my view they require much more stringent guidelines.
It’s important to note streaming services and radio stations have the capability to implement global Loudness Normalization. This frees content creators from any compliance responsibilities. All submitted media will be adjusted accordingly (turned up or turned down) in order to meet the intended distribution Target(s). This will result in consistency across the noted platform.
Unfortunately this is not the case in the now ubiquitous Podcasting space. At the time of this writing I am not aware of a single Podcast Network that (A) implements global Loudness Normalization … and/or … (B) specifies a requirement for Integrated Loudness and Maximum True Peak Targets for submitted media.
Currently Podcast Loudness compliance Targets are resolved by each individual producer. This is the root cause of wide perceptual loudness disparities across all programs in the space. In my view suggesting a diverse range of acceptable Targets especially for spoken word may further impede any attempts to establish consistency and standardization.
PLR and Retention of Music Dynamics
The document states: “Users may choose a Target Loudness that is lower than the -16.0 LUFS maximum, e.g., -18.0 LUFS, to better suit the dynamic characteristics of the program. The lower Target Loudness helps improve sound quality by permitting the programs to have a higher Peak to Loudness Ratio (PLR) without excessive peak limiting.”
The PLR correlates with headroom and dynamic range. It is the difference between the average Loudness and maximum amplitude. For example a piece of audio Loudness Normalized to -16.0 LUFS with a Maximum True Peak of -1 dBTP reveals a PLR of 15. As the Integrated Loudness Target is lowered, the PLR increases indicating additional headroom and wider dynamics.
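The arithmetic is a simple subtraction. Here is a tiny Python sketch (the function name is mine, not part of any spec) that reproduces the example above and shows the PLR available under the same -1 dBTP ceiling when the Target drops to -18.0 LUFS.

```python
def peak_to_loudness_ratio(max_true_peak_dbtp, integrated_lufs):
    """PLR: distance between the Maximum True Peak and Integrated Loudness."""
    return max_true_peak_dbtp - integrated_lufs


print(peak_to_loudness_ratio(-1.0, -16.0))  # 15.0
print(peak_to_loudness_ratio(-1.0, -18.0))  # 17.0
```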
In essence low Integrated Loudness Targets will help preserve dynamic range and natural fidelity. This approach is great for music production and streaming, and I support it. However in my view this may not be a viable solution for spoken word distribution, especially considering potential device gain deficiencies and ubiquitous consumption habits carried out in problematic environments. In fact in this particular scenario a moderately reduced dynamic range will improve spoken word intelligibility.
Recommended Processing Options and Limiting
If a piece of audio is measured in its entirety and the Integrated Loudness is higher than the intended Target, a subtractive gain offset normalizes the audio. For example if the audio checks in at -18.0 LUFS and you are targeting -20.0 LUFS, we simply subtract 2 dB of gain to meet compliance.
Conversely when the measured Integrated Loudness is lower than the intended Target, Loudness Normalization is much more complex. For example if the audio checks in at -20.0 LUFS, and the Integrated Loudness Target is -16.0 LUFS, a significant amount of gain must be added. In doing so the additional gain may very well cause overshoots, not only above the Maximum True Peak Target, but well above 0 dBFS. Inevitably clipping will occur. From my perspective this would clearly indicate the audio needs to be remixed or remastered prior to Loudness Normalization.
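Both cases can be sketched in a few lines of Python. The function names are hypothetical, and the -2.5 dBTP source peak is an assumed value chosen only to show an overshoot. A static gain offset moves the True Peak by the same number of dB as the Integrated Loudness.

```python
def normalization_gain_db(measured_lufs, target_lufs):
    """Static gain offset needed to move measured loudness onto the Target."""
    return target_lufs - measured_lufs


def peak_after_gain(max_true_peak_dbtp, gain_db):
    """A static gain shifts the Maximum True Peak by the same amount."""
    return max_true_peak_dbtp + gain_db


# Subtractive case from the text: -18.0 LUFS source, -20.0 LUFS Target.
print(normalization_gain_db(-18.0, -20.0))  # -2.0

# Additive case from the text: -20.0 LUFS source, -16.0 LUFS Target.
gain = normalization_gain_db(-20.0, -16.0)
print(gain)  # 4.0

# Assumed source peak of -2.5 dBTP: the added gain pushes it past full scale.
print(peak_after_gain(-2.5, gain))  # 1.5
```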
Under these circumstances I would be inclined to reestablish headroom by applying dynamic range compression. This approach will certainly curtail the need for aggressive limiting. As stated the reduced dynamic range may also improve spoken word intelligibility. I’m certainly not suggesting aggressive hyper-compression. The amount of dynamic range reduction is of course subjective. Let me also stress this technique may not be suitable for certain types of music.
Additional Document Recommendations and Efficiency
The authors of the document go on to share some very interesting suggestions in regard to effective Loudness Normalization:
1) “If level has to be raised, raise until it reaches Target level or until True Peak reaches 0 dBTP, whichever occurs first. Thus, the sound quality will be preserved, without introducing excessive peak limiting.”
2) “Perform what is noted in example 1, but keep raising the level until the program level reaches Target, and apply either peak limiting or allow some clipping to handle excessive peaks. The advantage is more consistent loudness in the stream, but this is a potential sonic compromise compared to example 1. The best way to retain sound quality and have more consistent loudness is by applying example 1 and implementing a lower Target.”
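Suggestion 1 amounts to capping the added gain at the available peak headroom. Here is a minimal Python sketch of my reading of it. The function name is hypothetical and the source measurements are assumed values, not taken from any real program.

```python
def capped_gain_db(measured_lufs, max_true_peak_dbtp, target_lufs,
                   ceiling_dbtp=0.0):
    """Raise level until the Target or the peak ceiling is hit, whichever
    comes first. Downward offsets are returned unchanged."""
    wanted = target_lufs - measured_lufs
    if wanted <= 0:
        return wanted
    headroom = max(ceiling_dbtp - max_true_peak_dbtp, 0.0)
    return min(wanted, headroom)


# Assumed dynamic source: -20.0 LUFS with a -1.0 dBTP Maximum True Peak.
gain = capped_gain_db(-20.0, -1.0, target_lufs=-16.0)
print(gain)          # 1.0
print(-20.0 + gain)  # -19.0
```

In this assumed case only 1 dB of gain is available, so the program lands at -19.0 LUFS, 3 LU short of a -16.0 LUFS Target. Suggestion 2 would close that gap with limiting or clipping.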
With these points in mind, please review/demo the following spoken word audio segment. In my opinion the audio in its current state is not optimized for Podcast distribution. It’s simply too low in terms of perceptual loudness and too dynamic for effective Loudness Normalization, especially if targeting -16.0 LUFS. Due to these attributes suggestion 1 above is clearly not an option. In fact neither is option 2. There is simply no available headroom to effectively add gain without driving the level well above full scale. Peak limiting is unavoidable.
I feel the document suggestions for the segment above are simply not viable, especially in my world where I will continue to recommend -16.0 LUFS as the recommended Target for spoken word Podcasts. Targeting -18.0 LUFS as opposed to -16.0 LUFS is certainly an option. It’s clear peak limiting will still be necessary.
Below is the same audio segment with dynamic range compression applied before Loudness Normalization to -16.0 LUFS. Notice there is no indication of aggressive limiting, even with a Maximum True Peak of -1.7 dBTP.
Regarding peak limiting the referenced document includes a few considerations. For example: “Instead of deciding on 2 dB of peak limiting, a combination of a -1 dBTP peak limiter threshold with an overall attenuation of 1 dB from the previously chosen Target may produce a more desirable result.”
This modification is adequate. However the general concept continues to suggest the acceptance of flexible Targets for spoken word. This may impede perceptual consistency across multiple programs within a given network.
The flexible best practices suggested in the AES document are 100% valid for music producers and diverse distribution platforms. However in my opinion this level of flexibility may not be well suited for spoken word audio processing and distribution.
I’m willing to support the curtailment of heavy peak limiting when attempting to normalize spoken word audio (especially to -16.0 LUFS) by slightly reducing the intended Integrated Loudness Target … but not by much. I will only consider doing so if and when my personal optimization methods prior to normalization yield unsatisfactory results.
My recommendation for Podcast producers would be to continue to target -16.0 LUFS for stereo files and -19.0 LUFS for mono files. If heavy limiting occurs, consider remixing or remastering with reduced dynamics. If optimization is unsuccessful, consider lowering the intended Integrated Loudness Target by no more than 2 LU.
A True Peak Maximum of <= -1.0 dBTP is fine. I will continue to suggest -1.5 dBTP for lossless files prior to lossy encoding. This will help ensure compliance in encoded lossy files. What’s crucial here is a full understanding of how lossy, low bit rate coders will overshoot peaks. This is relevant due to the ubiquitous (and not necessarily recommended) use of 64kbps for mono Podcast audio files.
Let me finish by stating the observations and recommendations expressed in this article reflect my own personal subjective opinions based on 11 years of experience working with spoken word audio distributed on the Internet and Mobile platforms. Please feel free to draw your own conclusions and implement the techniques that work best for you.
I’ve discussed the reasons why there is a need for revised (optimized) Loudness Standards for Internet and Mobile audio distribution. Problematic (noisy) consumption environments and possible device gain deficiencies justify an elevated Integrated Loudness target. Highly dynamic audio complicates matters further.
In essence audio for the Internet/Mobile platform must be perceptually louder on average compared to audio targeted for Broadcast. The audio must also exhibit carefully constrained dynamics in order to maintain optimized intelligibility.
The recommended Integrated Loudness targets for Internet and Mobile audio are -16.0 LUFS for stereo files and -19.0 LUFS for mono. They are perceptually equal.
In terms of Dynamics, I’ve expressed my opinion regarding compression. In my view spoken word audio intelligibility will be improved after careful Dynamic Range Compression is applied. Note that I do not advocate aggressive compression that may result in excessive loudness and possible quality degradation. The process is a subjective art. It takes practice with accessibility to well designed tools along with a full understanding of all settings.
I thought I would discuss various aspects of Podcast audio Dynamics: mainly, the potentially problematic significance of wide Dynamics and how to quantify it using various descriptors and measurement tools. I will also discuss the benefits of Dynamic Range management as a precursor to Loudness Normalization. Lastly I will disclose recommended benchmarks that are certainly not requirements. Feel free to draw your own conclusions and target what works best for you.
Highly Dynamic Audio in Noisy Environments
At its core extended or Wide Dynamic Range describes notable disparities between high and low level passages throughout a piece of audio. When this is prevalent in a spoken word segment, intelligibility will be compromised – especially in situations where the listening environment is less than ideal.
For example if you are traveling below Manhattan on a noisy subway, and a Podcast talent’s delivery is inconsistent, you may need to make realtime playback volume adjustments to compensate for any inconsistent high and low level passages.
As well – if the Integrated Loudness is below what is recommended, the listening device may be incapable of applying sufficient gain. Dynamic Range Compression will reestablish intelligibility.
From a post perspective – carefully constrained dynamics will provide additional headroom. This will optimize audio for further down stream processing and ultimately efficient Loudness Normalization.
Dynamic Range Compression and Loudness Normalization
I would say in most cases successful Loudness Normalization for Broadcast compliance requires nothing more than a simple subtractive gain offset. For example if your mastered piece checks in at -20.0 LUFS (stereo), and you are targeting R128 (-23.0 LUFS Integrated), subtracting 3 LU of gain will most likely result in compliant audio. By doing so the original dynamic attributes of the piece will be retained.
Things get a bit more complicated when your Integrated Loudness target is higher than the measured source. For example a mastered -20.0 LUFS piece will require additional gain to meet a -16.0 LUFS target. In this case you may need to apply a significant amount of limiting to prevent the Maximum True Peak from exceeding your target. In essence without safeguards, added gain may result in clipping. The key is to avoid excessive limiting if at all possible.
How do we optimize audio before a gain offset is applied?
I recommend applying a moderate to low amount of (global) final stage Dynamic Range Compression before Loudness Normalization. When processing highly dynamic audio this final stage compression will prevent instances of excessive limiting. The amount of compression is of course subjective. Often a mere 1-2 dB of gain reduction will be sufficient. Effectiveness will always depend on the attributes of the mastered source audio before L.Normalizing.
I carefully manage spoken word dynamics throughout client project workflows. I simply maintain sufficient headroom prior to Loudness Normalization. In most cases I am able to meet the intended Integrated Loudness and Maximum True Peak targets (without limiting) by simply adding gain.
RX Loudness Control
By design iZotope’s RX Loudness Control also applies compression in certain instances of Loudness Normalization. I suggest you read through the manual. It is packed with information regarding audio loudness processing and Loudness Normalization.
iZotope states the following:
“For many mixes, dynamics are not affected at all. This is because only a fixed gain is required to meet the spec. However, if your mix is too dynamic or has significant transients, compression and/or limiting are required to meet Short-term/Momentary or True Peak parts of the spec.”
“RX Loudness Control uses compression in a way that preserves the quality of your audio. When needed, a compressor dynamically adjusts your audio to ensure you get the best sound while remaining compliant. For loudness standards that require Short-term or Momentary compliance, the compressor is engaged automatically when loudness exceeds the specified target.”
It’s a highly recommended tool that simplifies offline processing in Pro Tools. Many of its features hook into Adobe’s Premiere Pro and Media Encoder.
LRA, PLR, and Measurement Tools
So how do we quantify spoken word audio dynamics? Most modern Loudness Meters are capable of calculating and displaying what is referred to as the Loudness Range (LRA). This particular descriptor is displayed in Loudness Units (LU’s). Loudness Range quantifies the differences in loudness measurements over time. This statistical perspective can help operators decide whether Dynamic Range Compression may be necessary for optimum intelligibility on a particular platform. (Note in order to prevent a skewed measurement due to various factors – the LRA algorithm incorporates relative and absolute threshold gating. For more information: refer to EBU Tech doc 3342).
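For a rough idea of what the meter is doing, here is a simplified Python sketch of the LRA calculation as I understand it from EBU Tech 3342: gate the Short Term loudness values (absolute gate at -70 LUFS, relative gate 20 LU below the mean of what remains), then take the spread between the 10th and 95th percentiles. The percentile lookup is a plain nearest-index approximation and the two value lists are invented, so treat the output as illustrative rather than compliant.

```python
import math


def loudness_range(short_term_lufs):
    """Approximate LRA (in LU) from a list of Short Term loudness values."""
    above_abs = [v for v in short_term_lufs if v > -70.0]
    mean_power = sum(10 ** (v / 10) for v in above_abs) / len(above_abs)
    relative_threshold = 10 * math.log10(mean_power) - 20.0
    gated = sorted(v for v in above_abs if v > relative_threshold)

    def percentile(p):
        # Nearest-index lookup; real meters may interpolate differently.
        return gated[round((len(gated) - 1) * p / 100)]

    return percentile(95) - percentile(10)


# Hypothetical Short Term readings for two spoken word programs.
steady = [-17.0, -16.0, -15.0] * 20
choppy = [-28.0, -22.0, -16.0, -12.0] * 15

print(loudness_range(steady))  # 2.0
print(loudness_range(choppy))  # 16.0
```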
I will say that before I came across rule of thumb (recommended) guidelines for Internet and Mobile audio distribution, the LRA in the majority of the work that I’ve produced over the years hovered around 3-5 LU. In the highly regarded article Audio for Mobile TV, iPad and iPod, author and leading expert Thomas Lund of TC Electronic suggests an LRA not much higher than 8 LU for optimal Pod Listening. Basically higher LRA readings suggest inconsistent dynamics, which in turn may not be suitable for Mobile platform distribution.
Some Loudness Meters also display the PLR descriptor, or Peak to Loudness Ratio. This correlates with headroom and dynamic range. It is the difference between the Program (average) Loudness and maximum amplitude. Assuming a piece of audio has been Loudness normalized to -16.0 LUFS along with an awareness of a True Peak Maximum somewhere around -1.0 dBTP, it is easy to recognize the general sweet spot for the Mobile platform: a PLR reasonably less than 16 for stereo.
Note that heavily compressed or aggressively limited (loud) audio will exhibit very low PLR readings. For example if the measured Integrated Loudness of a particular program is -10.0 LUFS with a Maximum True Peak of -1.0 dBTP, the reduced PLR (9) clearly indicates aggressive processing resulting in elevated perceptual loudness. This should be avoided.
If you are targeting -16.0 LUFS (Integrated), and your True Peak Maximum is somewhere between -1.0 and -3.0 dBTP, your PLR is well within the recommended range.
An optimal LRA is vital for Podcast/Spoken Word distribution. Use it to gauge delivery consistency, dynamics, and whether further optimization may be necessary. At this point in time I suggest adhering to an LRA < 7 LU for spoken word.
LRA Measurements may be performed in real time using a compliant Loudness Meter such as Nugen Audio’s VisLM 2, TC Electronic’s LM2n Loudness Radar, and iZotope’s Insight (also check out the Youlean Loudness Meter). Some meters are capable of performing offline measurements in supported DAWs. There are a number of stand alone third party measurement options available as well, such as iZotope’s RX7 Advanced Audio Editor, Auphonic Leveler, FFmpeg, and r128x.
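As one example of an offline measurement, FFmpeg ships an `ebur128` audio filter that reports Integrated Loudness, LRA, and (with `peak=true`) True Peak. The Python sketch below only builds the command line. The helper name and the file name are hypothetical, and it assumes `ffmpeg` is installed and on your PATH if you choose to run the result.

```python
import shlex


def ebur128_measure_command(input_path):
    """Build an FFmpeg command that measures a file without writing output."""
    return [
        "ffmpeg", "-hide_banner", "-nostats",
        "-i", input_path,
        "-af", "ebur128=peak=true",   # EBU R128 scanner with True Peak
        "-f", "null", "-",            # discard the decoded audio
    ]


command = ebur128_measure_command("episode.wav")  # hypothetical file
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, capture_output=True, text=True)` should leave the summary in the captured `stderr`, since FFmpeg writes its log output there.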
***Please note I personally paid for my RX Loudness Control license and I have no formal affiliation with iZotope.
I thought I’d clear up a few misconceptions regarding the Multiband Compressor bundled in Adobe Audition. Also, I’d like to discuss the infamous “Broadcast” preset that I feel is being recommended without proper guidance. This is an aggressive preset that applies excessive compression and heavy limiting resulting in processed audio that is often fatiguing to the listener.
The tool itself is “Powered by iZotope.” They are a well respected audio plugin and application development firm. Personally I think it’s great that Adobe decided to bundle this processor in Audition. However, it is far from a novice targeted tool. In fact it’s pretty robust.
What’s interesting is it’s referred to as a “Multiband Compressor.” This is slightly misleading, considering the processor includes a Peak Limiter stage along with its advertised Multiband Compressor. I think Dynamics Processor would be a more suitable name.
Basically the multi-band Compressor includes 3 adjustable crossovers, resulting in 4 independent Frequency Bands. Each Band includes a discrete Compressor with Threshold, Gain Compensation, Ratio, Attack, and Release settings. Bands can be soloed or bypassed.
There is a global Peak Limiter module located to the right of the Compressor settings. This module may be activated or bypassed. Without a clear understanding of the supplied settings for the Limiter, you run the risk of generating excessive loudness when processing audio. I’m referring to a substantial increase in perceived loudness.
The Limiter Parameters
The Threshold is the limiting trigger. When the input signal surpasses it, limiting is activated. The Margin is what defines the Peak Ceiling. As you decrease the Threshold, the signal is driven up to and against the Margin resulting in an increase in average loudness. This also results in dynamic range reduction.
Activating the “Brickwall Limiter” feature in the supplemental Options module will ensure accurate Margin compliance. In essence you will be implementing Hard Limiting. Deactivating this option may result in “overs” and/or peaks that exceed the specified Margin.
The bundled Broadcast preset defaults the Limiter Threshold setting to -10.0 dB with a Margin of -0.1 dBFS. Any alternative Threshold settings are of course subjective. I’m suggesting that it may be a good idea to ease up on this default Threshold setting. This will result in less aggressive limiting and a reduction of average levels.
I’m also suggesting that the default Margin setting of -0.1 is not recommended in this context. I would set this to -1.0 dBFS or lower (-1.5 dBFS, or even -2.0 dBFS).
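To visualize why the default Threshold is so aggressive, here is a deliberately simplified static model in Python. It is my own approximation of a Threshold/Margin style limiter, not Adobe’s or iZotope’s published algorithm: anything above the Threshold is held at the Threshold, and the whole signal is then raised so the Threshold lands on the Margin. The -20 dB and -6 dB input levels and the eased settings are hypothetical.

```python
def limiter_output_db(level_db, threshold_db, margin_db):
    """Static output level for a simplified Threshold/Margin limiter model.

    Assumption: levels above the Threshold are pinned to it, then the
    Threshold-to-Margin distance is added back as gain.
    """
    makeup_db = margin_db - threshold_db
    return min(level_db, threshold_db) + makeup_db


settings = {
    "Broadcast default": (-10.0, -0.1),   # Threshold, Margin from the preset
    "eased (hypothetical)": (-4.0, -1.5),
}
for name, (threshold, margin) in settings.items():
    quiet = limiter_output_db(-20.0, threshold, margin)
    loud = limiter_output_db(-6.0, threshold, margin)
    print(f"{name}: -20 dB in -> {quiet:.1f} dB, -6 dB in -> {loud:.1f} dB")
```

Under this model the default settings lift a -20 dB passage by almost 10 dB and pin a -6 dB peak at the Margin. The eased settings add only 2.5 dB and leave that peak untouched by the limiter.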
Please note this is not a True Peak Limiter. Your processed lossless audio file has the potential to lose headroom when and if it is converted to a lossy codec such as MP3.
At this point I suggest no changes should be made to the Attack and Release settings.
We cannot discount additional settings included in the Broadcast preset that are contributing to the aggressive processing. If you examine the Ratio settings for each independent compression module, 3:1 is the highest set Ratio. The predefined Ratios are fairly moderate and for starters require no adjustment.
However, notice the Threshold settings for each compression module as well as the Gain Compensation setting in Module (band) 4 (+3 dB).
First, the low Threshold settings result in fairly aggressive compression per band. Also, the band 4 gain compensation is generating a further increase in average level for that particular band.
Again the settings and any potential adjustments are subjective. My recommendation would be to experiment with the Threshold settings. Specifically, back off all Thresholds (raise them toward 0 dB) while maintaining their relative relationship. Do this by activating the “Link Band Controls” setting located in the supplemental Limiter Options.
View the red Gain Reduction meters included in each module. Monitor the amount of attenuation that occurs with the default Threshold settings. Compare initial readings with the gain reduction that occurs after you make your adjustments. Your goal is to ease up on the gain reduction. This will result in less aggressive compression. Remember to use your ears!
An area of misinformation for this processor is the purpose of the Output Gain adjustment, located at the far upper right of the interface. Please note this setting does not define the Peak Ceiling! Remember – it is the Margin setting in the Limiter module that defines your Ceiling. The Output Gain simply adds or cuts global output level after compression. Think of it as Global Gain compensation.
To prove my point, I dug out a short video demo that I created sometime last year for a community member.
With the Broadcast preset selected, and the Output Gain set to -1.5 dB – the actual output Peak Amplitude surpasses -1.5 dBFS, even with the Brickwall option turned ON. This reading is displayed numerically above the Output Gain meter(s) in real time.
In the second pass of the test I set the Output Gain to 0 dB. I then set the Limiter Margin to -1.5 dBFS. As the audio plays through you will notice the output is limited to and never surpasses -1.5 dBFS. Just keep your eye on the numerical, realtime display.
I purposely omitted any specific references to Attack and Release settings. They are the source for a future discussion.
Here’s an alternative use recommendation for this Adobe Multiband Compressor: DeEssing.
Use the Spectrum Analyzer to determine the frequency range where excessive sibilant energy occurs. Set two crossovers to encapsulate this range. Bypass the remaining associated compression modules. Tweak the remaining active band compression settings thus allowing the compressor to attenuate the problematic sibilant energy.
If you find the supplied Spectrum Analyzer difficult to read, consider using a third party option with higher resolution to perform your analysis.
Please note – in order to get the most out of this tool, you really need to learn and understand the basics of dynamics compression and how each setting will affect the source audio. More importantly, when someone simply suggests the use of a preset, take it with a grain of salt. More than likely this person lacks a full understanding of the tool, and may not be capable of providing clear instructional guidance for all functions. It’s a bad mix – especially when charging novices big bucks for training.
By the way, nothing wrong with being a novice. The point is paid consultants have an obligation to provide expert assistance. Boiler plate suggestions serve no purpose.
Two copies of an audio file. File 1 is Stereo, Loudness Normalized to -16.0 LUFS. File 2 is Mono, also Loudness Normalized to -16.0 LUFS.
Passing both files through a Loudness Meter confirms equal numerical Program Loudness. However the numbers do not reflect an obvious perceptual difference during playback. In fact the Mono file is perceptually louder than its Stereo counterpart.
Why would the channel configuration affect perceptual loudness of these equally measured files?
I’m going to refer to a feature that I came across in a Mackie Mixer User Manual. Mackie makes reference to the “Constant Loudness” principle used in their mixers, specifically when panning Mono channels.
On a mixer, hard-panning a Mono channel left or right results in equal apparent loudness (perceived loudness). It would then make sense to assume that if the channel was panned center, the output level would be hotter due to the combined or “mixed” level of the channel. In order to maintain consistent apparent loudness, Mackie attenuates center panned Mono channels by about 3 dB.
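I don’t have Mackie’s actual implementation, but a common way to get this behavior is a constant-power (sine/cosine) pan law. The Python sketch below assumes that law purely for illustration. It shows each side sitting about 3 dB down when the channel is panned center.

```python
import math


def constant_power_pan(position):
    """Return (left_gain, right_gain) as linear factors.

    position: -1.0 is hard left, 0.0 is center, +1.0 is hard right.
    """
    angle = (position + 1.0) * math.pi / 4
    return math.cos(angle), math.sin(angle)


def to_db(gain):
    """Convert a linear gain factor to dB (silence maps to -inf)."""
    return 20 * math.log10(gain) if gain > 0 else -math.inf


left, right = constant_power_pan(0.0)
print(f"center:    L {to_db(left):.2f} dB, R {to_db(right):.2f} dB")  # -3.01 each

left, right = constant_power_pan(-1.0)
print(f"hard left: L {to_db(left):.2f} dB, R {to_db(right)} dB")      # 0.00, -inf
```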
We can now apply this concept to the DAW …
A Mono file played back through two speakers (channels) in a DAW would be the same as passing audio through a Mono analog mixer channel panned center. In this scenario, the analog mixer (that adheres to the Constant Loudness principle) would attenuate the output by 3 dB.
In order to maintain equal perception between Loudness Normalized Stereo and Mono files targeting -16.0 LUFS, we can simulate the Constant Loudness principle in the DAW by attenuating Mono files by 3 LU. This compensation would shift the targeted Program Loudness for Mono files to -19.0 LUFS.
To summarize, if you plan to Loudness Normalize to the recommended targets for internet/mobile, and Podcast distribution … Stereo files should target -16.0 LUFS Program Loudness and Mono files should target -19.0 LUFS Program Loudness.
Note that in my discussions with leading experts in the space, it has come to my attention that this approach may not be sustainable. Many pros feel it is the responsibility of the playback device and/or delivery system to apply the necessary compensation. If this support is implemented, the perceived loudness of -16.0 LUFS Mono will be equal to -16.0 LUFS Stereo. There would be no need to apply manual compensation.
“Working group members believe that one solution may lie in promoting the use of Loudness Meters, which offer more precision by measuring audio levels numerically. Most shows are now mixed using peak meters, which are less exact.”
Peak Meters are exact – when they are used to display what they are designed to measure: Sample Peak Amplitude. They do not display an accurate representation of average, perceived loudness over time. They should only be used to monitor and ultimately prevent overload (clipping).
It’s great that the people in Public Radio are finally addressing distribution Loudness consistency and compliance. My hope is their initiative will carry over into their podcast distribution models. In my view before any success is achieved, a full understanding of all spec. descriptors and targets would be essential. I’m referring to Program (Integrated) Loudness, Short Term Loudness, Momentary Loudness, Loudness Range, and True Peak.
A Loudness Meter will display all delivery specification descriptors numerically and graphically. Meter descriptors will update in real time as audio passes through the meter.
Short Term Loudness values are often displayed from a graphical perspective as designed by the developer. For example TC Electronic’s set of meters (with the exception of the LM1n) display Short Term Loudness on a circular graph referred to as Radar. Nugen Audio’s VisLM meter displays Short Term Loudness on a grid based histogram. Both versions can be customized to suit your needs and work equally well.
Loudness Meters also include True Peak Meters that display any occurrences of Intersample Peaks.
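Here is a small Python illustration of why Sample Peak and True Peak can disagree. It is not a True Peak meter. Real meters oversample the actual signal, whereas this sketch simply evaluates the same known sine wave on a 4x denser grid to stand in for that step. The tone and phase are chosen on purpose so that every sample misses the crest.

```python
import math


def peak_db(samples):
    """Highest absolute sample value, expressed in dB relative to full scale."""
    return 20 * math.log10(max(abs(s) for s in samples))


def sine(points_per_cycle, cycles=8, phase=math.pi / 4):
    """A full scale sine evaluated at evenly spaced points."""
    count = points_per_cycle * cycles
    return [math.sin(2 * math.pi * i / points_per_cycle + phase)
            for i in range(count)]


sampled = sine(4)    # 4 samples per cycle, all landing at +/-0.707
dense = sine(16)     # same waveform, 4x denser grid

print(f"Sample Peak:      {peak_db(sampled):.2f} dBFS")   # -3.01
print(f"Peak on 4x grid:  {peak_db(dense):.2f} dB")       # 0.00
```

A Sample Peak meter would report about -3 dBFS for this tone even though the waveform between the samples reaches full scale.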
All Loudness standardization guidelines specify a Program Loudness or “Integrated Loudness” target. This time scaled descriptor indicates the average, perceived loudness of an entire segment or program from start to finish. It is displayed on an Absolute scale in LUFS (Loudness Units relative to Full Scale), or LKFS (Loudness, K-weighted, relative to Full Scale). Both are basically the same. LUFS is utilized in the EBU R128 spec. and LKFS is utilized in the ATSC A/85 spec. What is important is that a Loudness Meter can display Program Loudness in either LUFS or LKFS.
The Short Term Loudness (S) descriptor is measured within a time window of 3 seconds, and the Momentary Loudness (M) descriptor is measured within a time window of 400 ms.
The Loudness Range (LRA) descriptor can be associated with dynamic range and/or loudness distribution. It is the difference between average soft and average loud parts of an audio segment or program. This useful indicator can help operators decide whether dynamic range compression is necessary.
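To make the "average soft vs. average loud" idea a little more concrete, here is a minimal Python sketch of how a Loudness Range figure could be derived from a series of Short Term readings. It assumes the general approach described in EBU Tech 3342 (an absolute gate at -70 LUFS, a relative gate 20 LU below the gated average, then the spread between the 10th and 95th percentiles). The function name, the sample readings, and the simple nearest-rank percentile are simplifications for illustration, so a compliant meter may report slightly different values.

```python
import math

def loudness_range(short_term_lufs):
    """Estimate LRA (in LU) from a list of Short Term loudness values (LUFS)."""
    # Absolute gate: ignore anything below -70 LUFS.
    audible = [v for v in short_term_lufs if v >= -70.0]
    if not audible:
        return 0.0
    # Relative gate: 20 LU below the energy average of the remaining values.
    mean_power = sum(10 ** (v / 10) for v in audible) / len(audible)
    threshold = 10 * math.log10(mean_power) - 20.0
    gated = sorted(v for v in audible if v >= threshold)
    # Simplified nearest-rank percentiles:
    # 10th percentile ~ "average soft", 95th percentile ~ "average loud".
    low = gated[round(0.10 * (len(gated) - 1))]
    high = gated[round(0.95 * (len(gated) - 1))]
    return high - low

# Hypothetical readings: a steady climb from -40 to -20 LUFS plus one near-silent value.
readings = [float(v) for v in range(-40, -19)] + [-80.0]
print(f"LRA: {loudness_range(readings):.1f} LU")
```

A result well above 8 LU on spoken word material would be a hint that some dynamic range compression is worth considering.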
The specification Gate (G10) function temporarily pauses loudness measurements when the signal drops below a relative threshold, thus allowing only prominent foreground sound to be measured. The relative threshold is -10 LU below ungated LUFS. Momentary and Short Term measurements are not gated. There is also a -70 LUFS Absolute Gate that will force metering to ignore extreme low level noise.
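The effect of the two gates is easier to see with numbers. The sketch below is a simplified Python illustration, not a compliant meter: it assumes you already have a list of K-weighted loudness values for individual measurement blocks, and it only shows the gating and averaging logic. The function names and the sample values are hypothetical.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0   # ignore extreme low level noise
RELATIVE_GATE_LU = -10.0     # G10: 10 LU below the loudness of what passed the absolute gate

def energy_mean(levels):
    """Average loudness values in the power domain, not as plain decibel numbers."""
    return 10 * math.log10(sum(10 ** (v / 10) for v in levels) / len(levels))

def gated_program_loudness(block_lufs):
    """Program Loudness from per-block loudness values (LUFS), with both gates applied."""
    # Step 1: the absolute gate drops blocks at or below -70 LUFS.
    above_abs = [v for v in block_lufs if v > ABSOLUTE_GATE_LUFS]
    if not above_abs:
        return float("-inf")
    # Step 2: the relative gate is placed 10 LU below the loudness of what is left.
    threshold = energy_mean(above_abs) + RELATIVE_GATE_LU
    foreground = [v for v in above_abs if v > threshold]
    return energy_mean(foreground)

# Two blocks of speech, one quiet pause, one block of near silence (hypothetical values).
blocks = [-20.0, -20.0, -35.0, -80.0]
print(f"Ungated: {energy_mean(blocks):.1f} LUFS")
print(f"Gated:   {gated_program_loudness(blocks):.1f} LUFS")
```

The ungated figure is dragged down by the pause and the silence. The gated figure reflects only the foreground speech, which is the whole point of the Gate.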
Absolute vs. Relative
I mentioned that LUFS and LKFS are displayed on an Absolute scale. For example the EBU R128 Program Loudness target is -23.0 LUFS. For Podcast/Internet/Mobile the Program Loudness target is -16.0 LUFS.
There is also a Relative scale that displays LU’s, or Loudness Units. A Relative LU scale corresponds to an Absolute LUFS/LKFS scale, where 0 LU would equal the specified Absolute target. In practice, -23 LUFS in EBU R128 is equal to 0 LU. For Podcast/Mobile -16.0 LUFS would also be equal to 0 LU. Note that the operator would need to set the proper Program Loudness target in the Meter’s Preferences in order to conform.
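The relationship between the two scales is a simple subtraction. Here is a small Python sketch (the function name is my own, used only for illustration) showing how the same Absolute reading maps to different Relative values depending on the target set in the meter:

```python
def lufs_to_lu(lufs, target_lufs):
    """Convert an Absolute reading (LUFS/LKFS) to Relative LU for a given target."""
    return lufs - target_lufs

print(lufs_to_lu(-23.0, target_lufs=-23.0))  # at the EBU R128 target
print(lufs_to_lu(-16.0, target_lufs=-16.0))  # at the Podcast/Mobile target
print(lufs_to_lu(-16.0, target_lufs=-23.0))  # Podcast level viewed on an R128 meter
```

The first two calls return 0 LU. The third shows why the meter's target must be set correctly: audio sitting exactly on the Podcast target would read 7 LU hot on a meter still configured for R128.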
LU and dB Relationship
1 LU is equal to 1 dB. So for example you may have measured two programs: Program A checks in at -20 LUFS. Program B checks in at -15 LUFS. In this case program B is +5 LU louder than Program A.
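Since 1 LU equals 1 dB, a loudness difference can be applied directly as a gain change. The Python sketch below works through the Program A / Program B example and also shows the standard decibel to linear amplitude conversion, which is what a gain plugin is effectively doing under the hood. The function names are hypothetical.

```python
def loudness_difference_lu(program_a_lufs, program_b_lufs):
    """How much louder (positive) or softer (negative) B is than A, in LU."""
    return program_b_lufs - program_a_lufs

def db_to_amplitude_ratio(db):
    """Convert a dB (or LU) offset to a linear amplitude multiplier."""
    return 10 ** (db / 20)

difference = loudness_difference_lu(-20.0, -15.0)
print(f"Program B is {difference:+.1f} LU relative to Program A")
print(f"Amplitude multiplier to bring A up to B: {db_to_amplitude_ratio(difference):.3f}")
```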
Loudness Meter plugins mainly support online (Real Time) measurement of an audio signal. For an accurate measurement of Program Loudness of a clip or mixed segment the meter must be inserted in the DAW at the very end of a processing chain, preferably on the Master channel. If the inserts on the Master channel are post fader, any change in level using the Master Fader will result in a global gain offset to the entire mix. The meter would then (over time) display the altered Program Loudness.
If your DAW’s Master channel has pre fader inserts, the Loudness Meter should still be inserted on the Master Channel. However the operator would first need to route the mix through a Bus and use the Bus channel fader to apply global gain offset. The mix would then be routed to the Master channel where the Loudness Meter is inserted.
If your DAW totally lacks inserts on the Master channel, Buses would need to be used accordingly. Setup and routing would depend on whether the buses are pre or post fader.
Some Loudness Meter plugins are capable of performing offline measurements in certain DAW’s on selected regions and/or clips. In Pro Tools this would be an Audio Suite process. You can also accomplish this in Logic Pro X by initiating and completing an offline bounce through a Loudness Meter.
In my previous article I discussed various aspects of the Match Volume Processor in Adobe Audition CC. I mentioned that the ITU Loudness processing option must be used with care due to the lack of support for a user defined True Peak Ceiling.
Here’s how to implement the off-line processing version in Audition CC …
This is a snapshot of a stereo version of what may very well be the second most popular podcast in existence:
Amplitude Statistics in Audition:
True Peak Amplitude: 0.18dBTP
ITU Loudness: -15.04 LUFS
It appears the producer is Peak Normalizing to 0dBFS. In my opinion this is unacceptable. If I was handling post production for this program I would be much more comfortable with something like this at the source:
Amplitude Statistics in Audition:
True Peak Amplitude: -0.81dBTP
ITU Loudness: -15.88 LUFS
We will be shooting for the Internet/Mobile/Podcast target of -16.0 LUFS Program Loudness with a suitable True Peak Ceiling.
The first step is to run Amplitude Statistics and determine the existing Program Loudness. In this case it’s -15.88 LUFS. Next we need to Loudness Normalize to -24.0 LUFS. We do this by simply calculating the difference (-8.1) and applying it as a Gain Offset to the source file.
The next step is to implement a static processing chain (True Peak Limiter and secondary Gain Offset) in the Audition Effects Rack. Since these processing instances are static, save the Effects Rack as a Preset for future use.
Set the Limiter’s True Peak Ceiling to -9.5dBTP. Set the secondary Gain Offset to +8dB. Note that the Limiter must be inserted before the secondary Gain Offset.
Process, and you are done.
In this snapshot the upper waveform is the Loudness Normalized source (-24.0 LUFS). The lower waveform in the Preview Editor is the processed audio after it was passed through the Effects Rack chain.
In case you are wondering why the Limiter is before the secondary Gain instance – in a generic sense, if you start with -9.5 and add 8, the result will always be -1.5. This translates into the Limiter doing its job and never allowing the True Peaks in the audio to exceed -1.5dBTP. In essence this is the ultimate Ceiling. Of course it may be lower. It all depends on the state of the source file.
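The arithmetic behind the static chain can be written out in a few lines of Python. This is only a sketch of the math, with constant names of my own choosing. It assumes the limiter does not materially change the Program Loudness, which will not hold if the limiter is working hard.

```python
NORMALIZED_LUFS = -24.0        # Program Loudness after the initial Gain Offset
LIMITER_CEILING_DBTP = -9.5    # True Peak Ceiling set in the Limiter
FINAL_GAIN_DB = 8.0            # secondary Gain Offset, applied after the Limiter

def after_final_gain(level_db, gain_db=FINAL_GAIN_DB):
    """A static gain stage shifts loudness and peaks by the same amount."""
    return level_db + gain_db

print(f"Program Loudness: {after_final_gain(NORMALIZED_LUFS):.1f} LUFS")
print(f"Highest possible True Peak: {after_final_gain(LIMITER_CEILING_DBTP):.1f} dBTP")
```

Because the gain comes after the Limiter, nothing downstream can push the peaks above the shifted Ceiling.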
This last snapshot displays the processed audio that is fully compliant, followed by its Amplitude Statistics:
[– Determine Program Loudness of the source (Amplitude Statistics).
[– Loudness Normalize (Gain Offset) to -24.0 LUFS.
[– Run your saved Effects Rack chain that includes a True Peak Limiter (Ceiling set to -9.5dBTP) and a secondary +8dB Gain Offset.
*** UPDATE: Please note this post was written in 2014. The current version of Adobe Audition CC has been greatly enhanced, specifically in regards to the Match Loudness Module. It is now possible to define a True Peak Maximum, as well as Integrated/Program Loudness targets. It is also possible to customize Loudness Normalization Tolerance.
Adobe Audition CC has a handy Match Volume Processor with various options including Match To/ITU-R BS.1770-2 Loudness. The problem with this option is the Processor will not allow the operator to define a True Peak Ceiling. And so depending on various aspects of the input file, it’s possible the processed audio may not comply due to an unsuitable Peak Ceiling.
For example if you need to target -16.0 LUFS Program Loudness for internet/mobile distribution, the Match Volume Processor may need to increase gain in order to meet this target. Any time a gain increase is applied, you run the risk of pushing the Peak Ceiling to elevated levels.
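Before running any normalization that adds gain, it is easy to predict where the peaks will land. The Python sketch below is a hypothetical helper (not something Audition exposes) that takes the measured Program Loudness and True Peak of a file, along with the target and a desired Ceiling, and reports whether limiting would be needed. The input values are made up for illustration.

```python
def normalization_preview(measured_lufs, true_peak_dbtp, target_lufs, ceiling_dbtp):
    """Return (gain in dB, resulting True Peak in dBTP, whether limiting is needed)."""
    gain_db = target_lufs - measured_lufs
    # Plain gain shifts the True Peak by exactly the same number of dB.
    new_peak = true_peak_dbtp + gain_db
    return gain_db, new_peak, new_peak > ceiling_dbtp

# Hypothetical file: -20.0 LUFS with a True Peak of -2.0 dBTP, targeting -16.0 LUFS.
gain_db, new_peak, needs_limiting = normalization_preview(-20.0, -2.0, -16.0, -1.0)
print(f"Gain: {gain_db:+.1f} dB, resulting True Peak: {new_peak:+.1f} dBTP")
print("Limiting required" if needs_limiting else "Peaks already compliant")
```

In this example the required gain pushes the peaks over full scale, which is exactly the situation where an undefined Peak Ceiling becomes a problem.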
The ITU Loudness processing option does supply a basic Limiting option. However – it’s sort of predefined. My tests revealed Peak Ceilings as high as -0.1dBFS. This will result in insufficient headroom for both True Peak compliance and preparation for MP3 encoding.
The Audition Match Volume Processor also features a Match To/True Peak Amplitude option with a user defined True Peak Ceiling (referred to as Peak Volume). This is essentially a True Peak Limiter that is independent of the ITU Loudness Processor. For Program Loudness and True Peak compliance, it may be necessary to run both processing stages sequentially.
There are a few caveats …
[– If the Match Volume Processor (Match To/ITU-R BS.1770-2 Loudness) applies limiting that results in a Peak Ceiling close to full scale, any subsequent limiting (Match To/True Peak Amplitude) has the potential to reduce the existing Program Loudness.
[– If a Match Volume process (Match To/ITU-R BS.1770-2 Loudness) yields a compliant True Peak Ceiling right out of the box, there is no need to run any subsequent processing.
If you are going to use these processing options, my suggestion would be to make sure the measured Program Loudness of your input file is reasonably close to the Program Loudness that you are targeting. Also, make sure the input file has sufficient headroom, with existing True Peaks well below 0dBFS.
If you are finding it difficult to achieve acceptable results, I suggest you apply the concepts described in this video tutorial that I produced. I demonstrate a sort of manual “off-line” Loudness Normalization process. If you prefer to handle this in real time (on-line), refer to my article “Podcast Loudness Processing Workflow.”
Below is Elixir by Flux. This is an ITU-R BS.1770/EBU R128 compliant multichannel True Peak Limiter. It’s just one of the tools available that can be used in the workflow described below. In this post I also mention the ISL True Peak Limiter by Nugen Audio.
If you have any questions about these tools or Loudness Meters in general, ping me. In fact I think my next article will focus on the importance of learning how to use a Loudness Meter, so stay tuned …
In my previous post I made reference to an audio processing workflow recommended by Thomas Lund. The purpose of this workflow is to effectively process audio files targeting loudness specifications that are suitable for internet and mobile distribution. In other words – Podcasts.
“Mobile and computer devices have a different gain structure and make use of different codecs than domestic AV devices such as television. Tests have been performed to determine the standard operating level on Apple devices.
Based on 1250 music tracks and 210 broadcast programs, the Apple normalization number comes out as -16.2 LKFS (Loudness, K-weighted, relative to Full Scale) on a BS.1770-3 scale.
It is, therefore, suggested that when distributing Podcast or Mobile TV, to use a target level no lower than -16 LKFS. The easiest and best-sounding way to accomplish this is to:
[– Normalize to target level (-24 LKFS)
[– Limit peaks to -9 dBTP (Units for measurement of true peak audio level, relative to full scale)
[– Apply a gain change of +8 dB
Following this procedure, the distinction between foreground and background isn’t blurred, even on low-headroom platforms.”
Here is my interpretation of the steps referenced in the described workflow:
Step 1 – Normalize to target level -24.0 LUFS. (Notice Mr. Lund refers to LKFS instead of LUFS. No worries. Both are the same. LKFS translates to Loudness Units K-Weighted relative to Full Scale).
So how do we accomplish this? Simple – the source file needs to be measured and the existing Program Loudness needs to be established. Once you have this descriptor, it’s simple math. You calculate the difference between the existing Program Loudness and -24.0. The result will give you the initial gain offset that you need to apply.
I’ll point to a few off-line measurement utilities at the end of this post. Of course you can also measure in real time (on-line). In this case you would need to measure the source in its entirety in order to arrive at an accurate Program Loudness measurement.
Keep in mind since random Program Loudness descriptors at the source will vary on a file to file basis, the necessary gain offset to normalize will always be different. In essence this particular step is variable. Conversely steps 2 and 3 in the workflow are static processes. They will never change. The Limiter Ceiling will always be -9.0 dBTP, and the final gain stage will always be + 8dB. The -16.0 LUFS target “math” will only work if the Program Loudness is -24.0 LUFS at the very beginning from file to file.
Think about it – with the Limiter and final gain stage never changing, – if you have two source files where file A checks in at -19.0 LUFS and File B checks in at -21.0 LUFS, the processed outputs will not be the same. On the other hand if you always begin with a measured Program Loudness of -24.0 LUFS, you will be good to go.
[– If your source file checks in at -20.0 LUFS … with -24.0 as the target, the gain offset would be -4.0 dB.
[– If your source file checks in at -15.6 LUFS … with -24.0 as the target, the gain offset would be -8.4 dB.
[– If your source file checks in at -26.0 LUFS … with -24.0 as the target, the gain offset would be +2.0 dB.
[– If your source file checks in at -27.3 LUFS … with -24.0 as the target, the gain offset would be +3.3 dB.
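The four examples above all follow the same subtraction. A minimal Python sketch (function and constant names are my own) reproduces them. The calculation keeps the full float value, and rounding only happens when the result is printed.

```python
TARGET_LUFS = -24.0

def gain_offset_db(measured_lufs, target_lufs=TARGET_LUFS):
    """Gain (in dB) needed to move a file from its measured Program Loudness to the target."""
    return target_lufs - measured_lufs

for measured in (-20.0, -15.6, -26.0, -27.3):
    print(f"{measured:.1f} LUFS -> gain offset {gain_offset_db(measured):+.1f} dB")
```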
In order to maintain accuracy, make sure you use the float values in the calculation. Also – it’s important to properly optimize the source file (see example below) before performing Step 1. I’m referring to dynamics processing, equalization, noise reduction, etc. These options are for the most part subjective. For example if you prefer less compression resulting in wider dynamics, that’s fine. Handle it accordingly.
Moving forward we’ve established how to calculate and apply the necessary gain offset to Loudness Normalize the source audio to -24.0 LUFS. On to the next step …
Step 2 – Pass the processed audio through a True Peak Limiter with its Peak Ceiling set to -9.0 dBTP. Typically I set the Channel or “Stereo” Link to 100%, limiting Look Ahead to 1.5ms and Release Time to 150ms.
Step 3 – Apply +8dB of gain.
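To show the order of the gain staging, here is a rough Python sketch of the three steps operating on a short list of samples. Please note the big assumption: a simple hard clip stands in for the True Peak Limiter. A real limiter uses look ahead, release, and oversampled peak detection, so this placeholder only demonstrates where the Ceiling ends up, not how a limiter sounds. All function names and sample values are hypothetical.

```python
import math

def db_to_gain(db):
    """Convert a dB value to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def apply_gain(samples, gain_db):
    gain = db_to_gain(gain_db)
    return [s * gain for s in samples]

def crude_ceiling(samples, ceiling_db):
    """Hard clip at the ceiling. NOT a real True Peak Limiter, only a placeholder."""
    limit = db_to_gain(ceiling_db)
    return [max(-limit, min(limit, s)) for s in samples]

def three_step_chain(samples, measured_lufs):
    normalized = apply_gain(samples, -24.0 - measured_lufs)  # Step 1: variable offset
    limited = crude_ceiling(normalized, -9.0)                # Step 2: static ceiling
    return apply_gain(limited, 8.0)                          # Step 3: static gain

# Hypothetical samples from a file that measured -20.0 LUFS.
output = three_step_chain([0.9, -0.5, 0.05], measured_lufs=-20.0)
peak_db = 20 * math.log10(max(abs(s) for s in output))
print(f"Output peak: {peak_db:.1f} dBFS")
```

The loudest sample is caught at -9.0 and then raised by 8 dB, while the quiet sample passes through untouched with a net change of +4 dB (the -4 dB offset plus the +8 dB gain).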
You can set this up as an on-line process in a DAW, like this:
I’m using the gain adjustment feature in two instances of the Avid Time Adjuster plugin for the initial and final gain offsets. The source file on the track was first measured for Program Loudness. The necessary offset to meet the initial -24.0 LUFS target was -4 dB.
The audio then passes through the Nugen ISL True Peak Limiter with its Peak Ceiling set to -9.0 dBTP. Finally the audio is routed through the second instance of the Adjuster plugin adding +8 dB of gain. The Loudness meter displays the Program Loudness after 5 minutes of playback and will accurately display variations in Program Loudness throughout. Bouncing this session will output to the Normalized targets.
Note that you can also apply the initial gain offset, the limiting, and the final gain offset as independent off-line processes. The preliminary measurement of the audio file and gain offset are still required.
Review the file attributes:
The audio is fairly dynamic. So I apply an initial stage of compression:
Next I apply additional processing options that I feel are necessary to create a suitable intermediate. I reiterate these processing options are entirely subjective. Your desire may be to retain the Loudness Range and/or dynamic attributes present in the original file. If so you will need to process the audio accordingly.
Here is the intermediate:
The Program Loudness for this intermediate file is -20.2 LUFS. The initial gain offset required would be -3.8 dB before proceeding.
After applying the initial gain offset, pass the audio through the limiter, and then apply the final gain stage.
This is the resulting output:
That’s about it. We’re at -16.0 LUFS with a suitable True Peak Max.
I’ve experimented with this workflow countless times and I’ve found the results to be perfectly acceptable. As I previously stated – preparation of your source or intermediate file prior to implementing this three step process is subjective and totally up to you. The key is your output will always be in spec.
Offline Measuring Tools
I can recommend the following tools to measure files “off-line.” I’m sure there are many other options:
[– Auphonic Leveler Batch Processor. I don’t want to discount the availability and effectiveness of the products and services offered by Auphonic. It’s a highly recommended web service and standalone application that includes high quality audio processing algorithms, including Loudness Normalization.
In my No Free Pass for Podcasts post I talked about why the Broadcast Loudness specs. are not necessarily suitable for Podcasts. I noted that the Program Loudness targets for EBU R128 and ATSC A/85 are simply too low for internet and mobile audio distribution. Add excessively dynamic audio to the mix and it will complicate matters further, especially when listeners use mobile devices to consume their media in less than ideal ambient spaces.
Earlier today I was discussing this issue with someone who is well versed in all aspects of audio production and loudness processing. He noted that “ … the consensus of it all is, that it is a bad idea to take a really nice standard that leaves plenty of headroom and then start creating new standards with different reference values.” The fix would be to “keep production and storage at -23.0 LUFS and then adjust levels in distribution.” Valid points indeed. However in the real world this mindset is unrealistic, especially in the internet/mobile/Podcasting space.
The fact of the matter is there is no way to avoid the necessity to revise the standards that simply do not work on a platform that consists of unique variables.
And so considering these variables, the implementation of thoughtful, revised, best practices that include platform specific targets for Program Loudness, Loudness Range, and True Peak are unavoidable. Independent Podcasters and network driven Podcasts using arbitrary production techniques and delivery methods simply need direction and guidance in order to comply. In the end it’s all about presenting well produced media to the listener.
Recently I came across a tweet where someone stated “I love the show but it is consistently too quiet to listen to on my phone.” They were referring to the NPR program Fresh Air. I’m not exactly sure if this person was referring to the radio broadcast stream or the distributed Podcast. Either way it’s an interesting assertion that I can directly relate to.
I subscribe to the Fresh Air Podcast. This will probably not surprise you – I refuse to listen to the Podcast right out of the box. When a new show pops up in Instacast, I download the file, decode to WAV, convert to stereo, and then reprocess the audio. I tweak the dynamic range and address show participant audio level variations using various plugins. I then bump things up to -16.0 LUFS (using what I like to refer to as “The Lund Method”) while supplying enough headroom to comply with -1.0 dBTP as my ultimate ceiling. I’ll get into the specifics in a future post.
According to the leading expert Mr. Thomas Lund:
“Mobile and computer devices have a different gain structure and make use of different codecs than domestic AV devices such as television. Tests have been performed to determine the standard operating level on Apple devices. Based on 1250 music tracks and 210 broadcast programs, the Apple normalization number comes out as -16.2LKFS (Loudness, K-weighted, relative to Full Scale) on a BS.1770-3 scale.
It is, therefore, suggested that when distributing podcast or Mobile TV, to use a target level no lower than -16LKFS. The easiest and best-sounding way to accomplish this is to: 1) Normalize to target level (-24LKFS); 2) Limit peaks to -9dBTP (Units for measurement of true peak audio level, relative to full scale); and 3) Apply a gain change of +8dB. Following this procedure, the distinction between foreground and background isn’t blurred, even on low-headroom platforms.”
In this snapshot I demonstrate the described workflow. I’m using two independent instances of the bx_control plugin to apply the gain offsets at various stages of the signal flow. After the initial calculated offset is applied, the audio is routed through the Elixir True Peak Limiter and then out through the second instance of bx_control applying +8dB of static gain. You can also replicate this workflow on an off-line basis. Note that I’ve slightly altered the limiting recommendation.
So why do I feel the need to do this?
These are the specs. and the waveform overview of a recently published Fresh Air Podcast in its entirety:
Next is a 3 min. audio segment lifted from the published Podcast. The stats. display measurements of the attached 3 min. segment:
Podcast Optimized for Internet/Mobile
Below is the same 3 min. segment. I reprocessed the audio to make it suitable for Podcast distribution. The stats. display measurements of the attached audio segment:
The difference between the published source audio and the reprocessed version is quite obvious. The Loudness Normalized audio is so much more intelligible and easier to listen to. In my view the published audio is simply out of spec. and unsuitable for a Podcast.
Bear in mind the condition of the source audio is not uncommon. The problems that persist are not exclusive to podcasts distributed by NPR or by any of their affiliates. Networks with global reach need to recognize their Podcast distribution platforms as important mechanisms to expand their mass appeal.
It has been noted that the Public Radio community in general is exploring ways to enhance the way in which they produce their programs with focus on loudness standardization. My hope is this carries over to their Podcast platforms as well.
I think it was in the mid to late 1980’s. I was still living at home, totally fixated on what was happening with Television devices, programming and transmission. Mainly the advent of MTS Stereo compatible TV’s and VCR’s. I remember waiting patiently for weekly episodes of programs like Miami Vice and Crime Story to air. I would pipe the program audio through my media system in glorious MTS stereo. For me this was a game changer.
I also remember it was around the same time that Cable TV became available in the area. I convinced my Mom and Dad to allow me to order it. Initially it was installed on the living room TV, and eventually made its way on to additional TV’s throughout our home. For the most part it was a huge improvement in terms of reception and of course program diversity.
However there was one issue that struck me from the very beginning: the wide variations in loudness between network TV Shows, Movies, and Adverts. In fact it was common for targeted, poorly produced, and exceedingly loud local commercials to air repeatedly throughout broadcast transmissions. Reaching for the remote to apply volume attenuation was a common occurrence and a major annoyance.
Obviously this was not isolated. The issue was widespread and resulted in a public outcry to correct these inconsistencies. In 2010 the CALM Act was signed into law. The United States and Europe (and many other regions) adopted and now regulate loudness standardization guidelines for the benefit of the public at large.
If there is anyone out there who cannot relate to this “former” problem, I for one would be very surprised.
Well guess what? We now have the same exact problem existing on the most ubiquitous media distribution platform in existence – the internet.
I realize any expectation of widespread audio loudness standardization on the internet would be unreasonable. There’s just too much stuff out there. And those who create and distribute the media possess a wide scope of skills. However there is one sort of passionate and now ubiquitous subculture that may be ripe for some level of standardization. Of course I’m referring to the thousands upon thousands of independently produced Podcasts available to the masses.
In the past I’ve made similar public references to the following exercise. Just in case you missed it, please try this – at your own risk!
Put on your headphones and queue up this episode of The Audacity to Podcast. Set your playback volume at a comfortable level, sit back, and enjoy. After a few minutes, and without changing your playback volume setting – queue up this episode of the Entrepreneur on Fire podcast.
Need I say more?
From what I gather both programs are quite popular and highly regarded. I have no intention of suggesting that either producer is doing anything wrong. The way in which they process their audio is their artistic right. On the other hand in my view there is one responsibility they both share. That would be the obligation to deliver well produced content to their subscribers, especially if the Podcast generates a community driven revenue stream. It’s the one thing they will always have in common. And so I ask … wouldn’t it make sense to distribute media following audio processing best practices resulting in some level of consistency within this passionate subculture?
I suspect that some Podcast producers purposely implement extreme Program Loudness levels in an attempt to establish “supremacy on the dial.” This issue also exists in radio broadcast and music production, although things have improved ever since Loudness War participants were called to task with the inception of mandatory compliance guidelines.
I’ve also noticed that many prolific Podcast Producers (including major networks) are publishing content with a total lack of Program Loudness consistency within their own catalogs from show to show. Even more troubling, Podcast aggregation networks rarely specify standardization guidelines for content creators.
It’s important to note that many people who consume audio delivered on the internet do so in less than ideal ambient spaces (automobiles, subways, airplanes etc.) using low-fi gear (ear buds, headphones, mobile devices, and compromised desktop near fields). Simply adopting the broadcast standards wouldn’t work. The existing Program Loudness targets are simply unsuitable, especially if the media is highly dynamic. The space needs revised specs. in order to optimize the listening experience.
Loudness consistency from a Podcast listener’s perspective is solely in the hands of the producers who create the content. In fact it is possible producers may even share common subscribers. Like I said – the space is ripe for standardization.
Currently loudness compliance recommendations are sparse within this massive community driven network. In my view it’s time to raise awareness. A target specification would universally improve the listening experience and ultimately legitimize the viability of the platform.
For the record, I advocate:
File Format: Stereo, 128kbps minimum. Program Loudness: -16.0 LUFS with acceptance of a reasonable deviation. Loudness Range: 8 LU, or less. True Peak Ceiling: -1.0 dBTP in the distribution file. Of course this may be lower.
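These recommendations translate naturally into a simple checklist. The Python sketch below tests a set of measurements against them. Since I only state "a reasonable deviation" for Program Loudness, the ±1.0 LU tolerance used here is an assumption, as are the class and function names.

```python
from dataclasses import dataclass

@dataclass
class PodcastSpec:
    target_lufs: float = -16.0
    tolerance_lu: float = 1.0          # assumption: "reasonable deviation" is not defined
    max_lra_lu: float = 8.0
    max_true_peak_dbtp: float = -1.0

def check_compliance(program_lufs, lra_lu, true_peak_dbtp, spec=PodcastSpec()):
    """Return a list of problems. An empty list means the file meets the spec."""
    problems = []
    if abs(program_lufs - spec.target_lufs) > spec.tolerance_lu:
        problems.append("Program Loudness outside tolerance")
    if lra_lu > spec.max_lra_lu:
        problems.append("Loudness Range too wide")
    if true_peak_dbtp > spec.max_true_peak_dbtp:
        problems.append("True Peak above ceiling")
    return problems

# Hypothetical measurements for two files.
print(check_compliance(program_lufs=-16.2, lra_lu=6.5, true_peak_dbtp=-1.3))
print(check_compliance(program_lufs=-21.0, lra_lu=11.0, true_peak_dbtp=0.2))
```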
Quick note: when I refer to Podcasts, from a general perspective I am referring to audio programs and videos/screencasts/tutorials that primarily consist of spoken word soundtracks. Music based Podcasts or cinema styled videos with high impact driven soundtracks may not necessarily translate well when the Loudness Range (and Dynamic Range) is constricted.
Waves has just released a stellar update to their critically acclaimed WLM Loudness Meter. The new WLM Plus version, available for free to those who are eligible – includes a few new and very useful features.
The plugin now acts as both a Loudness Meter and a Loudness Processor. New controls (Gain/Trim) are located in the Processing Panel and are designed to apply loudness normalization and correction. There is also a new switchable True Peak Limiter that adheres to the True Peak parameter defined in the selected running preset.
Here’s how it works:
Notice below I am running WLM Plus using my own custom preset (figg -16 LUFS). Besides the obvious Integrated Loudness target (-16 LUFS), I’ve defined -1.0 dBTP as my True Peak ceiling.
What you need to do is insert the plugin at the end of your chain. Turn on the True Peak Limiter. Now play through the entire segment that you wish to measure and correct. During playback the textField value located on the WLM Plus Trim button will update in real time, displaying the proper amount of gain compensation that is necessary to meet the Integrated Loudness target (it’s +2.1 dB in this example).
When measurement is complete, simply press the Trim button. This will set the Gain slider to the proper value for accurate compensation. Finish up by bouncing the segment through WLM Plus, much the same as any processing plugin. The processed audio will now match the Integrated Loudness Preset target and True Peaks will be limited accordingly.
I haven’t tested this in Pro Tools but my guess is this also works when using WLM Plus as an Audio Suite process on individual clips.
Of course you can make a manual adjustment to the Gain slider as well. In this case you would use the displayed Trim Value to properly set the necessary amount of gain compensation.
Great update to this well designed Loudness Meter.
With the release of the Adobe “CC” versions of Audition and Premiere Pro, users now have access to a customized version of the tc electronic Loudness Radar Meter.
In this video from NAB 2013, an attendee asks an Adobe Rep: “So I’ve heard about Loudness Radar … but I don’t really understand how it works.”
I thought it would be a good idea to discuss the basics of Loudness Radar, targeting those who may not be too familiar with its design and function. Before doing so, there are a few key elements of loudness meters and measurement that must be understood before using Loudness Radar proficiently.
Loudness Measurement Specifications:
Program “Integrated” Loudness (I): The measured average loudness of an entire segment of audio.
Loudness Range (LRA): The difference between average soft and average loud parts of a segment.
True Peak (dBTP): The maximum electrical amplitude with focus on intersample peaks.
Meter Time Scales:
• Momentary (M) – time window: 400 ms
• Short Term (S) – time window: 3 sec.
• Integrated (I) – start to stop
Program Loudness Scales
Program Loudness is displayed in LUFS (Loudness Units Relative to Full Scale), or LKFS (Loudness K-Weighted Relative To Full Scale). Both are exactly the same and reference an Absolute Scale. The corresponding Relative Scale is displayed in LU’s (Loudness Units). 0 LU will equal the LUFS/LKFS Loudness Target. For more information please refer to this post.
LU’s can also be used to describe the difference in Program Loudness between two segments. For example: “My program is +3 LU louder than yours.” Note that 1 LU = 1 dB.
Meter Ranges (Mode/Scale)
Two examples of this would be EBU +9 and EBU +18. They refer to EBU R128 Meter Specifications. The stated number for each scale can be viewed as the amount of displayed loudness units that exceed the meter’s Loudness Target.
From the EBU R128 Doc:
1. (Range) -18.0 LU to +9.0 LU (-41.0 LUFS to -14.0 LUFS), named “EBU +9 scale”
2. (Range) -36.0 LU to +18.0 LU (-59.0 LUFS to -5.0 LUFS), named “EBU +18 scale”
The EBU +9 Range is well suited for broadcast and spoken word. EBU +18 works well for music, film, and cinema.
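The LUFS figures in parentheses above are simply the LU range shifted by the -23.0 LUFS target. A short Python sketch (the function name is hypothetical) confirms the math and shows what the same scale would cover with a custom target:

```python
def scale_range_lufs(low_lu, high_lu, target_lufs=-23.0):
    """Convert a Relative meter range (LU) to Absolute values (LUFS) for a given target."""
    return (target_lufs + low_lu, target_lufs + high_lu)

print("EBU +9 :", scale_range_lufs(-18.0, 9.0))
print("EBU +18:", scale_range_lufs(-36.0, 18.0))
print("+9 scale with a -16.0 LUFS target:", scale_range_lufs(-18.0, 9.0, target_lufs=-16.0))
```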
Loudness Compliance: Standardized vs. Custom
As you probably know two ubiquitous Loudness Compliance Standards are EBU R128 and ATSC A/85. In short, the Target Loudness for R128 is -23.0 LUFS with peaks not exceeding -1.0 dBTP. For ATSC A/85 it’s -24.0 LKFS, -2.0 dBTP. Compliant loudness meters include presets for these standards.
Setting up a loudness meter with a custom Loudness Target and True Peak is often supported. For example I advocate -16.0 LUFS, -1.5 dBTP for audio distributed on the internet. This is +7 or 8 LU hotter than the R128 and/or ATSC A/85 guidelines (refer to this document). Loudness Radar supports full customization options to suit your needs.
Loudness meters have “On and Off” switches, as well as a Reset function. For Loudness Radar – the Pause button temporarily halts metering and measurement. Reset clears all measurements and sets the radar needle back to the 12 o’clock position. Adobe Loudness Radar is mapped to the play/pause transport control of the host application.
The Loudness Standard options available in the Loudness Radar Settings designate Measurement Gating. In general, the Gate pauses the loudness measurement when a signal drops below a predefined threshold, thus allowing only prominent foreground sounds to be measured. This results in an accurate representation of Program Loudness. For EBU R128 the relative threshold is -10 LU below ungated LUFS. Momentary and Short Term measurements are not gated.
• ITU BS.1770-2 (G10) implements a Relative Gate at -10 LU and a low level Gate at -70 LUFS.
• Leq(K) implements a -70 LUFS low level Gate to avoid metering bias during 100% silent passages. This setting is part of the ATSC A/85 Specification.
In Audition CC you will find Loudness Radar located in Effects/Special/Loudness Radar Meter. It is also available in the Effects Rack and in the Audio Mixer as an Insert. Likewise it is available in Premiere Pro CC as an Insert in the Audio Track Mixer and in the Audio Effects Panel. In both host applications Loudness Radar can be used to measure individual clips or an entire mix/submix. Please note when measuring an audio mix – Loudness Radar must be placed at the very end of the processing chain. This includes routing your mix to a Bus in a multitrack project.
Most loudness meters use a horizontal graph to display Short Term Loudness over time. In the image below we are simulating 4 minutes of audio output. The red horizontal line is the Loudness Target. Since the simulated audio used in this example was not very dynamic, the playback loudness is fairly consistent relative to the Loudness Target. Program Loudness that exceeds the Loudness Target is displayed in yellow. Low level audio is represented in blue.
Each horizontal colored row represents 6 LU of audio output. This is the meter’s resolution.
Loudness Radar uses a circular graphic to display Short Term Loudness. A rotating needle, similar to a playhead, tracks the audio output at a user defined speed, anywhere from 1 minute to 24 hours for one complete rotation.
The circular LED meter on the perimeter of the Radar displays Momentary Loudness, with the user defined Loudness Target (or specification target) visible at the 12 o’clock position. The Momentary Range of the LED meter reflects what is selected in the Settings popup. The user can also customize the shift between green and blue colors by adjusting the Low Level Below setting.
The numerical displays for Program Loudness and Loudness Range will update in real time when metering is active. The meter’s Loudness Unit may be displayed as LUFS, LKFS, or LU. The Time display below the Loudness Unit display represents how long the meter is/was performing an active measurement (time since reset). Lastly the Peak Indicator LED will flash when audio peaks exceed the Peak Indicator setting.
If this is your first attempt to measure audio loudness using a loudness meter, focus on the main aspects of measurement: Program, Short Term, and Momentary Loudness. Also, pay close attention to the possible occurrence of True Peak overs.
The EBU R128 and ATSC A/85 presets will suit the vast majority of producers. Setup is pretty straightforward: select the standardization preset that displays your preferred Loudness Unit (LUFS, LKFS, or LU) and fire away. My guess is you will find Loudness Radar offers clear and concise loudness measurements with very little fuss.
You may have noticed the Loudness Target used in the above graphic is -16.0 LUFS. This is a custom target that I use in my studio for internet audio loudness measurements.
Professional audio Loudness Meters measure Program (Integrated) Loudness using an Absolute scale displayed in LUFS (or LKFS). For example the EBU R128 Program Loudness target is -23.0 LUFS (Loudness Units Relative to Full Scale).
When the ITU defined new audio loudness measurement guidelines, the general consensus was that many audio engineers would prefer to mix to the familiar “0” level on a Loudness Meter for compliance targeting. A Relative scale option was implemented. It references Loudness Units (LU), where 0 LU equals the corresponding LUFS/LKFS compliance target.
Of course in most cases the scales and corresponding targets are customizable. For example I advocate -16.0 LUFS as the loudness target for audio distributed on the internet. By defining -16.0 LUFS as my Absolute scale compliance target in a meter’s setup options, 0 LU (Relative scale) would be equivalent to -16.0 LUFS.
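The relationship between the two scales is a simple offset. Here is a minimal Python sketch; the function names are hypothetical and only illustrate the arithmetic.

```python
def to_relative_lu(loudness_lufs: float, target_lufs: float) -> float:
    """Absolute reading (LUFS/LKFS) -> Relative reading (LU) for a given target."""
    return loudness_lufs - target_lufs


def to_absolute_lufs(loudness_lu: float, target_lufs: float) -> float:
    """Relative reading (LU) -> Absolute reading (LUFS/LKFS) for a given target."""
    return loudness_lu + target_lufs


if __name__ == "__main__":
    # With a -16.0 LUFS target, hitting the target reads 0 LU.
    print(to_relative_lu(-16.0, target_lufs=-16.0))
    # With the EBU R128 target, a -20.0 LUFS reading is 3 LU above target.
    print(to_relative_lu(-20.0, target_lufs=-23.0))
```

Whichever target you define, 0 LU always marks that target, and positive LU values mean the audio is hotter than the target.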
Below is a basic side by side comparison of EBU R128 Absolute and Relative scales:
Wide variations in average (Program/Integrated) Loudness are common across all forms of audio distributed on the internet. This includes audio Podcasts, Videocasts, and Streaming Media. This is due to the total lack of any standardized guidelines in the space. Need proof? Head over to Twit.tv and listen to a few minutes of any one of their programs. Use headphones, and set your playback volume to a comfortable level.
Now head over to PodcastAnswerMan.com, and without making any change to your playback volume – listen to the latest program.
I rest my case.
In fact, there is a 10 LU difference in average loudness between the two. Twit.tv programs check in at approximately -22 LUFS. PodcastAnswerMan checks in at approximately -12 LUFS. I find this astonishing, but I am not surprised. I’m not singling them out over quality issues or anything like that. In my view both networks do a great job, and my guess is they have sizable audiences. Both shows are well produced and it simply makes sense to compare them in this case study.
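To put that gap in numbers, the short Python sketch below works out the gain offset each program would need to land on a common target. The -16.0 LUFS default is the custom target mentioned earlier, the measured values are the approximate figures quoted above, and the function names are hypothetical.

```python
def normalization_gain_db(measured_lufs: float, target_lufs: float = -16.0) -> float:
    """Gain offset (dB) that moves a measured Program Loudness onto the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    # Approximate Program Loudness figures quoted in the text.
    for name, measured in (("Twit.tv", -22.0), ("PodcastAnswerMan", -12.0)):
        gain = normalization_gain_db(measured)
        print(f"{name}: {gain:+.1f} dB (x{db_to_linear(gain):.3f})")
```

Keep in mind that a positive gain offset raises peaks by the same amount, so True Peak levels need to be checked after any upward adjustment.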
With all this in mind let me stress that at this particular time I am not going to focus on discussing Program Loudness variations or any potential suggested standard. I can assure you this is coming! I will say that I advocate -16.0 LUFS (Program/Integrated Loudness) for all media formats distributed on the internet. Stay tuned for more on this. For now I would like to discuss True Peak compliance that will be a vital part of any recommended distribution standard.
What surprises me more than Program Loudness inconsistency is just how many producers are pushing files with clipped, distorted audio. In many cases Intersample Peaks are present in audio files that have been normalized to 0 dBFS. (For more information on Intersample Peaks please refer to this brief explanation). Producers need to correct this problem before their audio is distributed.
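To see why a file normalized to 0 dBFS can still contain Intersample Peaks, here is a rough Python sketch. It is not a compliant True Peak meter. The windowed-sinc interpolation, the 4x oversampling factor, the `half_width` parameter, and the test tone (a sine at one quarter of the sample rate, offset by 45 degrees) are all assumptions chosen purely for illustration.

```python
import math


def sample_peak_dbfs(samples: list[float]) -> float:
    """Highest absolute sample value, in dBFS."""
    return 20.0 * math.log10(max(abs(s) for s in samples))


def true_peak_dbtp(samples: list[float], oversample: int = 4, half_width: int = 16) -> float:
    """Estimate the peak of the reconstructed waveform between the samples."""
    peak = 0.0
    # Skip the edges so every interpolated point has a full kernel around it.
    for i in range(half_width, len(samples) - half_width):
        for step in range(oversample):
            t = i + step / oversample
            acc = 0.0
            for k in range(i - half_width + 1, i + half_width + 1):
                d = t - k
                if d == 0.0:
                    acc += samples[k]
                else:
                    sinc = math.sin(math.pi * d) / (math.pi * d)
                    hann = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                    acc += samples[k] * sinc * hann
            peak = max(peak, abs(acc))
    return 20.0 * math.log10(peak)


if __name__ == "__main__":
    # Every sample of this tone lands at about 0.707 of the waveform's crest.
    tone = [math.sin(2.0 * math.pi * n / 4.0 + math.pi / 4.0) for n in range(128)]
    scale = 1.0 / max(abs(s) for s in tone)
    normalized = [s * scale for s in tone]  # Peak Normalized to 0 dBFS
    print(f"Sample peak: {sample_peak_dbfs(normalized):+.2f} dBFS")
    print(f"True peak:   {true_peak_dbtp(normalized):+.2f} dBTP")
```

The samples of this tone never exceed 0 dBFS, yet the waveform they describe crests roughly 3 dB higher between them. That is exactly the kind of overshoot a sample peak reading will miss.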
One of the most useful features included in Adobe Audition is the Match Volume Processor. This tool includes various options that allow the operator to “dial in” specific average loudness and peak amplitude targets. After processing, the operator can examine the results by using Audition’s Amplitude Statistics analysis to check for accuracy.
Notice in the snapshot above I set the processor to Match To: Total RMS, with a -18.50 dB RMS average target. I’ve also selected the Use Limiting option. I’m able to dial in custom Look-Ahead and Release Time parameters as I see fit. Is there something missing? Indeed there is. Any time you push average levels you run the risk of clipping the source. In Audition the Match Volume/Use Limiting option lacks the capability for the operator to set a specific Peak Amplitude Ceiling. I’ve determined that in certain situations Peak Amplitudes reach a -0.1 dB ceiling, resulting in possible clipped samples and True Peak levels that exceed 0 dBFS. Keep in mind this is not always the case. The results depend on the Dynamic Range and available Headroom of any source.
So how do we handle it?
Notice above the Match Volume Processor offers two Peak Amplitude options: Peak Amplitude and True Peak Amplitude. The European Broadcasting Union’s EBU R128 spec. dictates -1.0 dBTP (True Peak) as the ultimate ceiling to meet compliance. Here in the states ATSC A/85 dictates -2.0 dBTP. Since most, if not all audio formats distributed on the internet are delivered in lossy formats, it is important to pay close attention to True Peak Amplitude for both source (lossless) and distribution (lossy) files.
I advocate -1.0 dBTP as the standard for internet based audio file delivery. True Peak Limiters are able to detect Intersample Peaks and prevent them from occurring. It is recommended to pass audio through a True Peak compliant limiter after loudness normalization and prior to lossy encoding. Options include ISL by Nugen Audio, Elixir by Flux, and (the best kept secret out there) TB Barricade by ToneBoosters. If you are running Audition, select Match To: True Peak Amplitude and you should be all set.
The plugin developers mentioned above as well as Waves, MeterPlugs, tc electronic, Grimm Audio, and iZotope supply Loudness Meters and toolsets that display all aspects of loudness specifications including True Peak alerts. Visit this page for a list of supported Loudness Meters.
If True Peak detection and compliance is not within your reach due to the lack of capable tools, a slightly more aggressive ceiling (-1.5 dBFS) is recommended for Peak Normalization. The additional 0.5 dB acts as a sort of safety net, ensuring maximum peak amplitude remains at or below -1.0 dBFS. One thing to keep in mind … performing Peak Amplitude Normalization after Loudness Normalization may very well result in a reduction in average Program Loudness. Once again changes to the processed audio will depend on the audio attributes prior to Peak Normalizing.
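The trade-off between the peak ceiling and Program Loudness is easy to work out, because Peak Normalization applies one gain offset to the whole file. The Python sketch below uses hypothetical function names and a hypothetical file that sits at -16.0 LUFS with a -0.1 dBFS sample peak after Loudness Normalization.

```python
def peak_normalize_gain_db(sample_peak_dbfs: float, ceiling_dbfs: float = -1.5) -> float:
    """Gain offset (dB) that moves the highest sample peak onto the ceiling."""
    return ceiling_dbfs - sample_peak_dbfs


def loudness_after_gain(program_lufs: float, gain_db: float) -> float:
    """A static gain offset shifts Program Loudness by the same number of LU."""
    return program_lufs + gain_db


if __name__ == "__main__":
    # Hypothetical file: Loudness Normalized to -16.0 LUFS, peaking at -0.1 dBFS.
    gain = peak_normalize_gain_db(sample_peak_dbfs=-0.1)
    result = loudness_after_gain(-16.0, gain)
    print(f"Gain applied: {gain:+.1f} dB")
    print(f"Program Loudness after Peak Normalization: {result:.1f} LUFS")
```

In this example the ceiling costs 1.4 LU of Program Loudness. If the source peak is already below the ceiling, the same formula yields a positive gain, so in practice you would only apply it when the result is negative.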
Below I’ve supplied data that supports what I noted above. The table displays three iterations of a test file: Input, Loudness Normalized Intermediate, and final Output. For this test I used the ITU-R BS.1770-2 “Match To” option in Audition’s Match Volume Processor. I pushed the average target to -16.0 LUFS. As noted, this is the target that I advocate for internet and/or mobile audio. This target is +7 LU hotter than R128 and +8 LU hotter than ATSC A/85.
After processing the Input file, the average target was met in the Intermediate file, but True Peak overs occurred. The Intermediate file was then passed through a compliant True Peak Limiter with its ceiling set to -1.0 dBTP. Compliance was met in the Output with a minimal reduction in Program Loudness.
Producers: there is absolutely no excuse if your audio contains distortion due to clipping! At the very least you should Peak Normalize to -1.5 dBFS prior to encoding your lossy MP3. Every audio application on the planet offers the option to Peak Normalize, including GarageBand and Audacity. Best case scenario is to adopt True Peak compliance and learn how to use the tools that are necessary to get it done. If you are an experienced producer or professional, and you come across content that does not comply – reach out and offer guidance.