Recording Multiple Skype Clients On A Single Host System

It is possible to record two (or more) independently connected Skype clients on discrete tracks on a single computer in RT. The workflow requires independent Mix-Minus feeds configured in a supported DAW such as Pro Tools or Logic Pro.

Plausible Session Senarios:

(Scenario A) Typical Podcast consisting of a Host + Skype Guest + Skype Guest. Dual Mix-Minus feeds are implemented in the Host’s DAW. All participants recorded on discrete tracks in RT utilizing two individual incoming Skype clients running simultaneously on the Host system.

(Scenario B) Engineer + Skype Session Participant + Skype Session Participant. Dual Mix Minus feeds are implemented in the Host’s DAW. Both participants recorded on discrete tracks utilizing two individual incoming Skype clients running simultaneously on the Host system.

Scenario B describes an engineering session providing support for independently located remote Skype participants who seek recording and post services. The workflow frees the participants from recording responsibilities and file management.

As noted both Scenarios require the use of two individual Skype clients running simultaneously on the Host/Engineer’s system. This concept is publicly documented using various methods. In fact our good friend Mike Phillips describes an example workflow in this article.

What differentiates my workflow is the use of virtual routing within the Recording Session on a single machine. Dual Mix-Minus feeds are implemented in the Host’s DAW with zero dependency on hardware Aux Sends.

Loopback by Rogue Amoeba is used to create Virtual Devices and Pass-Thru’s. They will be encapsulated in an Aggregate Audio Device created in OSX. Additionally, my working Motu Audio Interface (8×8) will be added to the Aggregate Device for maximum flexibility.

Dual Mix-Minus

The intent of a single Mix-Minus feed is to send a Host’s audio back to a Session participant. This is commonly implemented on a hardware mixer or console using an Aux Send. It is nothing more than a discrete audio output with a level control.

When adding a second participant, the Host’s audio is routed to both participants using two Aux Sends (A), (B). The implemented Sends are also used to establish communication between the included participants.

For example:

Send (A) contains the Host + Participant 1 —-> signal is routed to Participant 2
Send (B) contains the Host + Participant 2 —-> signal is routed to Participant 1

Virtual Device Creation

The following I/O configuration is necessary for the described Host/Engineer + Skype 1 + Skype 2 scenario:

3 Mono Inputs: [Host] + [Skype Client 1] + [Skype Client 2]
2 Mono Outputs: [Host/Skype Client 1] + [Host/Skype Client 2]

Additional output routing will be necessary for monitoring and external recording. We will address this in a moment.

Please review the following I/O Matrix table:

Column 1 lists six Virtual Devices created in Rogue Amoeba’s Loopback application. Column 2 lists their associated user defined names.

• An initial Motu Audio Interface instance is created with inputs/outputs 1+2 mapped for use. Input 1 will represent the Host Mic.

• Four individual (Mono) Pass-Thru Devices are created:

Input 4 will be mapped to Skype Client 1
Input 6 will be mapped to Skype Client 2

Output 3 will include [Host + Skype Client 2]
Output 5 will include [Host + Skype Client 1]

• A secondary Motu instance is created with all available inputs/outputs mapped for use (8×8 by default). This will supply additional routing flexibility for monitoring and external recording. In fact the I/O Matrix table displays the use of outputs 13+14 for the Cue Monitor Mix (Phones).

Note the Inputs and Outputs are purposely alternated to prevent direct patching and subsequent feedback.

These user defined Loopback Virtual Devices will appear in the Mac OSX Audio MIDI Setup utility. They can be used individually. They can also be combined, thus creating a cumulative (Aggregate) Audio Device. We will utilize both options (individual Virtual Devices for Skype Clients + cumulative Aggregate as the DAW’s default I/O).

Aggregate Device

The image below displays a user defined Aggregate Audio Device created in OSX using the Audio MIDI Setup utility. It is named Skype (Dual) MixMinus. Notice how I’ve selected the Virtual Devices created in Loopback as Subdevices. Also notice how each Subdevice accurately displays input and output I/O mapping for a total of 14 inputs + 14 outputs. This matches the configuration displayed in the I/O Matrix table diagram above. The Aggregate Audio Device is now ready for DAW integration.

DAW Implementation

For this demonstration I will be using Pro Tools with the Skype (Dual) MixMinus Aggregate set as the Playback Engine (it’s default Session I/O). This configuration has also been successfully implemented in Logic Pro X. It has not been tested in Adobe Audition.

The Chanel Strip configuration will be described in sequential order. Please note the described Session configuration is more complex than what is required.

The first 3 Channel Strips (Green) are mono Auxiliary Inputs. Their assigned Inputs are the Host Mic, Skype Client 1, and Skype Client 2. Notice how the assigned inputs match the input configuration as displayed in the I/O Matrix table diagram (1 + 4 + 6).

The Faders on these Channel Strips function as input level controllers for each source input before the signals reach the pre-fader recording tracks.

Two audio plugins are inserted on each Skype Client input Channel Strip (Downward Expander and Limiter). The Expanders will transparently attenuate the inactive input. The Limiters will function as a safeguard thus preventing unexpected signal level overload. Plenty of headroom is maintained. In essence the Limiters will rarely engage.

Tracking Configuration

The outputs of the source input Channel Strips are routed (via virtual Buses) to the inputs of 3 standard mono Audio Channel Strips (Blue). When armed, they will record the source inputs discretely.


The Host Channel contains 2 active Sends passing audio to Bus 1 and Bus 2.
The Skype 1 Channel contains 1 active Send passing audio to Bus 2.
The Skype 2 Channel contains 1 active Send passing audio to Bus 1.


2 additional Auxiliary Input Channel Strips (Purple) receive signal from Send Buses 1 + 2.

Configuration as follows:

• The To Skype-1 input is set to Bus 1. This Bus includes the tapped Host audio and the tapped Skype 2 client audio. It’s output is set to Output 3.

• The To Skype-2 input is set to Bus 2. This Bus includes the tapped Host audio and the tapped Skype 1 client audio. It’s output is set to Output 5.

Notice how the assigned outputs (3 + 5) match the output configuration displayed in the I/O Matrix table diagram.

At this point we’ve created a dual Mix-Minus in the mixer…

* * *

Monitoring and Pan Offset

Pro Tools attenuates center-panned mono tracks according to a user defined Pan Depth setting. My setting is always -3 dB.

Here’s how I reconstitute the attenuation:

Notice the outputs of the Skype 1 and Skype 2 audio tracks are routed to a stereo Bus labeled to Offset. An Auxiliary Input Channel Strip (Green, labeled Mix Offset) receives the audio from the to Offset virtual Bus. I use the Channel Strip fader to add +3 dB of static gain to reconstitute the previously applied attenuation on the passing signal.

The Mix Offset Channel Strip’s output is set to Phones. This signal path represents the Interface Headphone outputs (13+14). They are referenced in the I/O Matrix table diagram.

The Master Fader’s (Yellow) output is also set to Phones. This configuration allows the engineer to monitor the Skype participants via headphones connected to the Motu Interface.

Notice the output for the Host Audio Track is set to Mute Bus. This is an unassigned virtual Bus. The Host Mic input is directly monitored (also via headphones) through the Motu Interface. Setting the Host channel output to the Session’s Phones output Bus will blend the hardware monitored mic signal with the slightly latent Session output. Using the unassigned Bus solves this. Of course in Post the hardware monitored signal will be absent. In this case the output must be reassigned to the Phones output Bus.


In preparation for recording, two independent instances of Skype (using unique accounts) must be launched on the Host System.

My Preferred method:

1) Launch Skype as normal and login to your primary account.

2) In the Skype Preferences/Audio/Video – define the Microphone (input) and Speakers (output) as displayed:

Notice we revert back to independent Virtual Devices created in Loopback for the configuration of this Skype instance. The Host + Skype 2 device is essentially output 3 in the configured DAW. It passes the Host + Skype Client 2 audio to this running instance of Skype.

[Speakers: Skype 1] is mapped to input 4, previously assigned in the DAW’s configured Session.

3) To launch the second instance of Skype – run the OSX Terminal application and execute the following command:

open -na /Applications/ –args -DataPath /Users/$(whoami)/Library/Application\ Support/Skype2

(I created an executable Shell Script that runs the displayed command. Once created, simply double click it’s icon to launch Skype).

A second instance of Skype will launch and prompt you for credentials. Login using your secondary Skype account.

4) In the Skype Preferences for this instance – define the Microphone (input) and Speakers (output) as displayed:

Once again we revert back to independent Virtual Devices created in Loopback for the configuration of this Skype instance. The Host + Skype 1 device is essentially output 5 in the configured DAW. It passes the Host + Skype Client 1 audio to this running instance of Skype.

[Speakers: Skype 2] is mapped to input 6, previously assigned in the DAW’s configured Session.

Recording in the Box

After launching and configuring the Skype instance(s), arm the DAW’s Host, Skype 1, and Skype 2 audio tracks for recording. Connect with the independent Skype participants. Both participants will be able to converse with each other + the Host. Recording the Session will supply discrete audio files for each participant on their respective tracks.

External Recording

In the I/O Matrix diagram you will notice the availability of two sets of stereo outputs (9+10 , 11+12). They represent the Line Outputs and the S/PDIF output on the Motu Interface. Remember the Interface is a Subdevice within the defined Aggregate Device. As a result the noted inputs and outputs are available within the DAW Session for patching.

Also notice the last two Channel Strips (Red) displayed in the Session mixer. They are Auxiliary Input Channel Strips. Their inputs are assigned to the Skype 1 and Skype 2 output Buses. Each Channel Strip output is mapped to corresponding Motu Interface Line Outputs and finally patched to the L+R inputs of an external solid state stereo recorder.

In this particular example only the Skype Participants will be recorded externally. My intension is to engineer Sessions containing two remote clients. In this case it’s a viable solution for out of the box Session recording.


You will notice a few additional Audio Plugins inserted on various Channel Strips. A Mix Bus Compressor and a Limiter are inserted on the Mix Offset Channel Strip.

The Inserts located on the Master Fader are post fader. Here I’ve inserted the Clarity M routing plugin. This passes the signal to an external (hardware) Loudness Meter via USB.

Finally I’ve inserted Limiters on each of the external recorder Buses. Again they are set to maintain maximum headroom, and only exist to prevent unexpected signal level overload before the audio reaches the recorder.

Of course Plugin implementation in general will be subjective.


The complexity of the Session can be customized or even minimized to suit your needs. Basic requirements include a properly configured Aggregate I/O, 3 audio tracks capable of recording, 2 Aux Sends, and a Master Fader. The dual Skype requirement is necessary and straightforward.

It is possible to add support for additional running Skype clients. This will require additional (mono) Loopback Pass-Thru Virtual Devices, and further customization of the Aggregate Audio Device + DAW Session.

I defined custom Incoming Connection Ports for each Skype Instance. This option is available in Skype Preferences/Advanced. Port Mapping was managed in my Router’s configuration utility.

I closely monitored System Resources throughout testing and checked for potential deficiencies. Pro Tools performed well with no issues. Each running instance of Skype displayed less than 14% CPU usage. Memory consumption was equally low. Note my Quad 2.8 GHz Mac Pro has 32 gigs of RAM and four dedicated media drives.

Undoubtedly someone will state this implementation is “much too complicated for the common Podcaster,” or even “Broadcaster.” With respect I’m not necessarily targeting novices. Regardless, you will most certainly require skills and experience in DAW and I/O signal routing.

Please note a Mix-Minus feed in general is not some sort of revelation. It’s pretty basic stuff. You’ll need a full understanding of it as well.

If you have questions I am happy to help. If you would like to participate in a test, ping me. If you are overwhelmed please revert to a service such as Zencastr.


Technorati Tags: , , ,

Utilizing Multiple Outputs for Recording

The vast majority of audio industry professionals use DAWS running on proficient computer systems to record audio directly to secondary hard disks. For some reason direct to disk recording is not widely endorsed in the Podcasting space. Many consultants (for various reasons) advise against this recording method. Instead, they recommend the use of inexpensive hand-held solid state Recorders.

For instance I’ve heard a few people state “computers cause ground loops”, hence the widespread Portable Recorder recommendation. In my opinion that is a half-baked assertion. In fact, ANY electronic component in a signal chain (including your electrical system) is capable of producing inherent noise. Often the replacement of cheaply manufactured components (interfaces, mixers, processors, cables, etc.) will solve audible noise problems. The key is to isolate the source and correct or replace it.

Portable Recorders are well suited for location interviews and video shoots. For in-studio sessions I feel direct to disk recording on a proficient system is much more flexible compared to the use of an external device. More so, the sole use of a Portable Recorder without a proper backup strategy is flat out risky.

That being said I thought I would document a basic Skype Recording session that I implemented in Pro Tools using a multi-output Motu Audio Interface. The incoming audio will be recorded on a secondary hard disk installed (or interfaced) on the host system. The real time session audio will also be routed to an alternate Interface Output, feeding an external Recorder for backup purposes.


Note a multi-output Mixer can be used in place of an Audio Interface. As far as software you can use any modern DAW to replicate the described session. If you are using a Mac, Rogue Amoeba’s distinctive Audio Hijack application is also highly capable.


1-Record Studio Host and Skype Participant on discrete mono tracks in real time.

2-Combine the discrete recordings and create a split-stereo clip with independent dynamics processing applied to each channel, all in real time.

3-Use a Pre-Fader Send to independently control the level of the split-stereo discrete recording, and patch the real time signal to the Interface S/PDIF Output. This will feed the external Recorder’s S/PDIF Input.

4-Monitor the session through Headphones and play out through Desktop near-field Monitors.

Please review the displayed Pro Tools session snapshot.

• The Input for the mono Host track is the Interface connected mic. The Input for the mono Skype track is “Mix 1 Return.” This is an Interface supported feature, allowing the operator to route the computer’s Output (in this case Skype) to an available DAW Input. This configuration effectively creates a mix-minus with discrete, unprocessed recordings on individual mono tracks.

• The mono recording tracks are routed to individual mono Aux Input tracks using Buses. The Aux Input tracks are hard-panned L+R and contain various inserted processing options, including a Gain Trim, Expander, and Compressor.

The processing applied in this session is not intended to replace what would normally occur in post. The Compressors are there just to tame dynamics in the event either participant exceeds nominal input levels. The Expander is set up to apply mild attenuation when the host is not speaking.

• The Aux Input tracks have their Outputs set to a common stereo Bus.

• Finally a third standard stereo audio track (Rec-Sum) uses the stereo Bus Output(s) as it’s Inputs. By hard panning the channels L+R we are able to maintain discrete channel separation within any printed stereo clip.

To record the discrete raw audio and the processed split-stereo audio in real time, we simply arm all session Audio tracks to record and fire away. The session can be monitored through Headphones and played out through near fields via the Main Output.

Secondary Output

The Motu Interface used for this session has a total of 8 Outputs, including a stereo S/PDIF option. I implemented Pre-Fader Send on the session’s Rec-Sum channel with it’s Output set to S/PDIF. This will route the track’s split-stereo audio to the S/PDIF stereo Input of an external Marantz CF Recorder. With the Send designated as Pre-Fader, it’s level control will be independent of the parent (Rec-Sum) channel fader, thus allowing discrete control of the real time signal being fed to the Recorder.

Note in the displayed Pro Tools session snapshot – the floating fader positioned to the left of the mixer is a user friendly and easily accessible copy of the much smaller Send fader displayed in the parent (Rec-Sum) track.

In summary, we can successfully initialize and capture 4 recordings in a single pass: the raw Host audio, the raw Skype participant audio, a split-stereo processed version of the Skype session, and a split-stereo copy of the processed Skype session stored on the Recorder.

The image below displays the completed session with the split-stereo clip playing through the Main Outputs.


My general recommendation:when it is feasible, use direct to disk and Portable recording options in unison on a proficient system to capture in-studio multitrack and single participant Podcast sessions.


Technorati Tags: , ,

Skype, Logic Pro X, and Aggregate Devices …


Studio Host and Skype participant to be recorded inside Logic Pro X on a single machine (single pass) with no additional hardware other than a Mic Input Device.


[– Two independent mono Host/Participant stems with no processing.

[– One processed split-stereo mixdown of the session with the Host and Guest residing on discrete (L+R) channels.

[– Real time Processing and Recording of all instances.


Of course the objectives noted above are easily attainable using two independent machines, with the recording box running Logic Pro X and the Skype machine handling the connection. In this case you would also need to use a mixer to set up a proper mix-minus.

You can also implement similar workflows by using two inexpensive USB audio interfaces connected to a single machine.

Considering the resourcefulness of today’s modern day Macs, I’m confident the following workflow will be successful freeing the user from complexities and added costs.

OSX Aggregate Devices

The foundation of this setup is based on a user created Aggregate Audio Device. Aggregate devices appear in the OSX System Preferences/Sound I/O options for system wide use. By wrapping supported “Subdevices” into a single Aggregate, you effectivly create a sort of cumulative Input Device that can be designated in Logic as the default. We also need a software utility that supports routing of the Skype Output to an Input in Logic.

I originally created this workflow using SoundFlower that was installed on my secondary iMac and carried over form previous versions of OSX. SoundFlower, along with the iMac’s Line Input were wrapped into a single Aggregate Device, and then designated in Logic as the default Input.

This worked well. However, I had no plans to install the now unsupported SoundFlower on my production MacPro for further testing. And so I looked around for a suitable up to date (and actively developed) replacement for SoundFlower.

Sound Siphon

Sound Siphon by Static Z Software “… makes your Mac’s Audio Output available as an Audio Input Device. It enables you to send audio from one application to another where it can be processed, streamed, or recorded.

Exactly what I needed.

Note that Sound Siphon is very diverse in terms of features. And the developer states that many useful enhancements are in the works. You can download a restricted demo. My hope is that you consider purchasing a $29.99 license. This will ensure the longevity of the application and continued development. Note that I have no affilation and I gladly purchased a license.

This is a snapshot of Sound Siphon:


In the example above I display a user defined Device (“Capture Safari”) that is essentially a Custom Audio Input. I then associated the Safari Application with this device. This becomes a system wide option to capture Safari audio. For example QuickTime X will now display “Capture Safari” as an Input option for audio recording.

It’s important to note that this particular Sound Siphon feature is supplemental to the Skype recording implementation. In other words – it’s an entrley different use case scenario. My goal here is to disclose the flexibility of the application.

Creating the Aggregate Device

Input 1 on my Mackie Onyx 1220i Mixer receives the output from a dbx 286A Voice Processor. The studio Mic is connected to the processor for proper gain staging. I needed to wrap the Mic signal along with the Skype audio into a single Input Device and designate it in Logic’s Preferences for proper routing.

To create an Aggregate Device, open Audio MIDI Setup, located in ~/Applications/Utilities. When creating a new Aggregate, supported Subdevices appear in the right side setup table.


Notice that Sound Siphon is listed as a 2 in/2 out device in the left source view. This is created when you install the application. Once installed, it will be available to be wrapped into an Aggregate Device along with pre-existing devices.

For my implementation I created “Skype Tracker” as a new Aggregate and selected my mixer (Onyx-(2528)) and Sound Siphon as Subdevices. Up top you set your Sample Rate and the Clock Source. My system seems to perform better with Sound Siphon set as the Clock Source.

It’s important to review the Input Channel matrix of the new Aggregate Device. Notice that Sound Siphon will only support Input channels (17+18). When routing Inputs in Logic, I will use Input 1 for the studio Mic and Input 17 for Skype.


Here are the Skype settings that I am using:


The Microphone is set to the Aggregate Device. The Speakers option is set to Sound Siphon. This setting is imperative and from what I can tell non-flexiable.

Logic Pro X

The first thing we need to do is define the Input Device in Global Preferences/Audio/Devices. I set mine to the Aggregate Device:


Next we will address setup and routing. What’s important here is that I use an Object in Logic that may not be immediately obvious in your particular installation.

Specifically, I often use Input Channel Strip Objects in my projects. They are implemented in the Environemnt (aka “MIDI Environment”). It is accessible form the Logic Window Menu.

From the Logic Docs regarding Input Channel Strips:

“The Input Channel Strip allows you to directly route and control signals from your audio hardware’s Inputs. Once an Input Channel Strip is assigned to an Audio Channel Strip, it can be monitored and recorded directly into Logic Pro, along with its effect plug-ins.

The signal is processed, inclusive of plug-ins even while Logic Pro is not playing. In other words, Input Channel Strips can behave just like external hardware processors. Aux sends can be used pre- or post-fader.

Input Channel Strips can be used as live Inputs that can stream audio signals from external sources (such as MIDI synthesizers and sound modules) into a stereo mix (by bouncing an Output Channel Strip).”

You can also create Bus Channel Strip Objects in the Environment. They are not the same as Auxiliary Channel Strips and can be quite useful in certain instances. For more information about Bus Channel Strips please refer to this article.

The Environment

To expose the accessability of the Logic Environment, open global Preferences and access the Advanced options. The MIDI option needs to be selected as part of the Advanced Tools:


Once that setting is ticked, “Open Midi Environment” will appear as an option in the Logic Window Menu.

Channel Strip Objects are added to the Environment from the New Menu/Channel Strip. Notice how the Environment emulates the Project Mixer:


Note that when adding Input Channel Strips in the Environment, you must define the corresponding (Aggregate) Device Inputs using the Channel Strip editor:


For this particular project I created two Input Channel Strips in the Environment using Inputs 1 and 17 respectively, based on Aggregate Subdevice availability (Input 1 = Mic, Input 17 = Skype).

You will also need 4 Audio Tracks (2 Mono, 1 Stereo, 1 PreListen), and 2 (Mono) Auxiliary Channel Strips. Create Audio Tracks using the Track/New Tracks option – located in the Logic Application Menu. Add Auxiliary Channel Strips using the Mixer’s Options Menu/Create New … || Note that the Input Channel Strips created in the Environment should be designated Mono.

Here is my Project Mixer with all necessary Objects and Routing:



The reddish labeled channels are the two Input Channel Strips that I created in the Environment. If you look at the text at the very top of these Channel Strips, you will see their Input designations.

The signals coming in through the Inputs are routed to their own independent Aux Channels for processing. Notice I inserted a Gain Trim on the Mic Input Channel. All processing options are of course subjective. One example would be to insert two instances of a Compressor on each Aux Channel. You would set these up to apply real time, non-aggresive dynamic range compression as you record.

Moving forward – notice the Aux Channels are Mono and hard panned L+R respectivly. This will maintain channel separation when recording the split-stereo version of the session. In this example each Aux Channel Output is routed to Audio Channel 3 (“Split Record”). This Stereo Audio Track is panned center. When armed it will record the Aux Channel Outputs to a split-stereo file.

Also study how I set up the remaining Audio Tracks – Audio Track 1 (“Rec. Mic”) and Audio Track 2 (“Rec. Skype”). Their Inputs are set to Bus 1 and 2 respectively, allowing these tracks to receive the unprocessed Outputs (“dry” audio) from the Input Channel Strips.

Keep in mind that if Effects are inserted on the Input Channel Strips, the audio routed to Audio Tracks 1+2 will be processed. In most cases I would not insert any Effects on the Input Channel Strips other than Gain. My intension here is to record dry stems.

I Grouped various aspects of these two channels, mainly Volume, Mute, Solo, and Record. This will link the faders and make it easy to control audibility of the mono stems cumulatively.

Wrap Up

That’s basicilly it. You can record/monitor all tracks in real time. And when you are done, there is no need to bounce, although you still can. You simply “Export” or “Export Region” as an individual file(s).



You may have noticed the Outputs for the Auxiliary Channel Strips (1+2) and the Input for Audio Track 3 (“Split Record”) is Bus 3. This is in fact a virtual (permanent) Bus used to route the processed audio to Track 3 for recording.

When you select a permanent virtual Bus in Logic for routing, an Auxiliary Channel Strip is auto-created and will appear in the Mixer. For this particular workflow – we use two Auxiliary Channel Strips, one for Mic processing and a second for Skype processing.

Throughout this entire workflow no changes were made to my default OSX Audio I/O Settings located in System Preferences/Sound.

As I always say – Audio Tracking and Post are highly subjective arts. In fact many Logic “experts” have never heard of or utilized the options in the Environment. And your processing options are also subjective. My hope is this documentation will at the very least introduce you the creation and usage of Aggregate Devices.

If by chance you develop a successful alternative solution, all well and good. In my tests I’ve found the documented implementation to work quite well.

Let me know if you have any questions.

I’d like to thank my friend Victor Cajiao for his help while testing this workflow.


Technorati Tags: , ,