High Fidelity Remote Communication

Written on September 30, 2021 & updated on November 10, 2021.

17 min. read

Before the pandemic even started, remote work was already on the rise around the world because it makes sense. Knowledge work doesn’t need to depend on cramped, loud, and unhealthy open offices located in expensive areas devoid of affordable housing. Bringing a laptop to a room inside of a building you spend hours traveling to every day is the definition of absurdity.

Instead, working remotely has become almost normal, except to corporate leaders who prefer portraying it as “working from home” as if you need to be grateful for the (temporary) perk. This patronizing butts-in-seats mentality assumes that you’re not really working, you’re “working from home”. Perhaps you’re really doing laundry or running errands, but not actually “working”. This ignores the fact that work is already remote by nature when it happens on the Internet. Large organizations effectively have remote offices. They coordinate work across cities and countries. Remote work is here, it’s just not evenly distributed.

Remote Shell Shock

One of the first complaints from first-time remote workers in the spring of 2020 when the pandemic led to an abrupt exodus from shared offices all over the world: it’s hard to communicate remotely.

No shit.

It’s hard to communicate. Period. It was already hard for every remote worker who had to communicate with clustered workers crammed in loud conference rooms, often in blatant violation of fire codes. It’s only now more obvious to everyone how difficult it was for the remote workers to communicate before.

Picture of more than twenty people crowded together in the back of an office near an iMac sporting a tiny webcam in the foreground.
A whole-company standup meeting at Code School's office back in December 2013, featuring in the bottom right what would soon become my eyes and ears.

I was one of the first fully remote workers at my job back in 2013. One month I was happily commuting at least three days a week to our Orlando office, the next I was planning a move back to France. I set up shop as an independent contractor in my 30 square meters (330 sq ft) studio in Paris, six timezones away. My manager at the time was clear: be loud about any friction you encounter, we want to make this work and you’re going to be the canary in the coal mine.

Bad Meeting Habits

I quickly realized our stand-up meetings were quite awful remotely. A dozen people standing in a circle in a large open space, talking in turn about what they’re working on, doesn’t quite work. Especially with a distant webcam and a weak microphone. Soon, folks volunteered to pass a laptop around so I could hear better. We worked on our own laptops all day long but somehow crowded around one to give each other updates. It’s obvious in hindsight that too many people were involved, but it was easy to overlook the issue in person. We soon reduced the size of teams which needed to share frequent updates. I partially credit remote communication for accelerating this realization. Standups later became video calls where each person used their own machine, making everyone equally visible and audible, wherever they worked.

It’s hard to say that I thrived working 7253 kilometers away from my co-workers, but somehow, I managed. I enacted a personal policy that has persisted ever since, sometimes (I surmise) to the annoyance of my more microphone- or camera-shy co-workers: any non-binary (yes or no) answer should lead to an audio or video call so that context can be provided much faster than through back and forth text-based chat. Any demonstration should be done over screensharing or a recorded (and edited) screencast, not with lazily put together step-by-step instructions. That’s because there are always missing steps, and you never identify them when you’re writing them down. You waste other people’s time instead.

The Before Times

Profile picture of the author using Sony MDR 7506 monitor headphones and the Shure SM7B dynamic microphone.

By definition being remote means not being there. But feeling present goes a long way. A simple look can trigger a strong reaction and a sense of shared understanding. A slight change in intonation can convey doubt or excitement better than a paragraph. Cameras can’t magically make your expressions visible when light isn’t bouncing off your face. Backlighting or contre-jour for example is a very common mistake that I see very smart people make over and over again, even during important video calls featuring very important people you’d assume would have staff to assist them.

When I moved back to Paris in 2014, I purchased my first Logitech C920 720p webcam. Since I was also co-hosting a podcast at the time, I did some microphone research and bought an absurdly expensive but oh so great Shure SM7B microphone, a Scarlett 2i2 XLR to USB interface (to convert the analog signal to USB) and a cheap pre-amplifier. This setup alone allowed me to stand out and be often seen and heard better than many of my in-office co-workers who crowded together in conference rooms and open spaces.

Sensors Aren’t Eyes and Ears

Comparing the audio waveforms of three different microphones: the 2019 16-inch MacBook Pro, the Logitech Brio webcam, and the Shure SM7B.
The flatter the audio output of a microphone, the less lifelike you will sound.

Still, being so remote was challenging. I didn’t know how to set up the audio interface properly. I mistakenly held out on purchasing a good set of studio headphones thinking I had a sense of my own voice’s volume. But a microphone, like a camera, doesn’t have a human perspective on what loud means. It will blast your co-worker’s ears off or sound like you’re far away. If you’re lucky someone will complain that you’re hard to understand. Most people won’t bother. Don’t use a microphone with live feedback without monitoring headphones. The pros do it for a reason.

Your typical headphone microphones don’t count. You’ll only hear other people’s voices when you wear them, not your own. This is even worse with noise cancellation, which gives you less awareness of your own voice’s volume. Even if your voice is at the appropriate level in your environment, you’d be surprised how differently you sound depending on what microphone you’re using and how far away it is from your face.
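If you don’t have monitoring headphones yet, one rough way to see what a microphone actually captured is to measure the level of a recording. The sketch below is only an illustration, not part of my setup: it assumes samples already normalized to the -1.0 to 1.0 range, and the `level_dbfs` and `sine` helpers are hypothetical names. It uses synthetic tones to stand in for a loud and a quiet voice.

```python
import math


def level_dbfs(samples):
    """Return (peak, rms) in dBFS for samples normalized to [-1.0, 1.0]."""
    if not samples:
        raise ValueError("need at least one sample")

    def to_db(value):
        # 0 dBFS is the loudest level a digital signal can reach.
        return 20 * math.log10(value) if value > 0 else float("-inf")

    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return to_db(peak), to_db(rms)


def sine(amplitude, freq_hz=440, rate=48000, seconds=1.0):
    """Generate a test tone standing in for a recorded voice."""
    count = int(rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate)
            for i in range(count)]


# A tone that fills the whole range versus one recorded far too quietly.
loud_peak, loud_rms = level_dbfs(sine(1.0))
quiet_peak, quiet_rms = level_dbfs(sine(0.05))

print(f"loud:  peak {loud_peak:.1f} dBFS, rms {loud_rms:.1f} dBFS")
print(f"quiet: peak {quiet_peak:.1f} dBFS, rms {quiet_rms:.1f} dBFS")
```

With a real recording you would first load the samples (for example with Python’s `wave` module) and normalize them. A peak pinned at 0 dBFS suggests clipping, while a very low RMS suggests you sound far away.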

Comparing Microphone Outputs

Here are three radically different microphones recording the exact same input albeit at different distances from my voice:

2019 16-inch MacBook Pro (60 cm from face)
Logitech Brio webcam (50 cm from face)
Shure SM7B (10 cm from face)

Here is a longer demo of the RODE NT-USB microphone where I demonstrate how useful an articulated boom arm is:

RODE NT-USB (10 cm from face)

And the cheapest microphone I’ve tested, the Samson Q2U, is impressive and, like the RODE, can work with USB alone. But it can also support an analog XLR to USB interface, which can allow you to push the gain (received input volume of the mic) higher and likely get cleaner output as well, depending on your audio interface.

Samson Q2U (10 cm from face over USB)
Samson Q2U (10 cm from face over XLR via Scarlett 2i2 interface)

As a bonus, here are some popular Apple mobile devices that are frequently used to send audio and video but don’t fare particularly well, even when recording directly on-device with the Voice Memos app:

Apple AirPods Max (on your face)
Apple iPhone 12 Mini (close to your face)
Apple 2019 iPad Pro (30 cm from your face)

I think these demonstrate how much more present you can sound with a better microphone. I talk in a bit more detail about this and microphone technique in a previous post.

Face Time

Screenshot of a fully-remote standup with three people each on their own webcams.
A Code School Platform team standup from December 2017, finally fully remote.

Now let’s talk about your face. Apple did something quite meaningful with FaceTime. They put the onus on precisely what makes you miss your family and friends: their face. Not where they happen to be at the time you call them, or the broad context of what’s around them in a horizontal view, but their vertical portrait. Somehow, Apple still manages to produce webcams that are as bad as they were a decade ago. Meanwhile, Apple makes some of the best selfie phone cameras in the world.

Screen capture of the output of a Logitech C920 webcam shot with ambient lighting behind my monitor.
Cropped Logitech C920 output lit with an Ikea lamp & soft white LED back in 2015.

I started out with a cheap and reliable webcam: the Logitech C920. It’s from 2012 and outputs only 720p but for nearly a decade this webcam was basically the best out there. Especially given limitations in bandwidth. Later on in the 2010s, webcam manufacturers introduced full HD or 1080p resolution, and eventually 4K. It’s still arguably too much for just showing a small face on a screen. As photo cameras became all about more megapixels, so did webcams. Focusing on raw output size over output quality, especially in low light.

Compared to camera sensors now common in mobile phones, webcams are a decade behind. My friend Justin Searls found a way to use an old iPhone as a webcam and I completely get it. It’s far more practical than the solution I arrived at just before the pandemic: using a Sony A6000 mirrorless camera with an expensive 50mm lens and an Elgato Cam Link 4K acquisition card so I can use a sensor and lens combo no webcam maker can compete with. The strange video you see at the top of this post was filmed with this setup. One that I actively recommended against to any fellow remoter. Particularly folks who aren’t into photography or videography. It’s cumbersome, complex, and requires constant fidgeting to keep the camera on, obtain a consistent color temperature, or prevent automatic focus hunting due to shallow depth of field. Plus you often have to replace the camera’s battery with an adapter so you don’t run out of juice in the middle of a meeting.

Two long years into this pandemic, one of the few companies that seems to have grasped the importance of a quality sensor and lens combo is Elgato, with their (thankfully mic-less) Facecam. But its output is too wide by default and according to Elgato’s own GM it’s best to tweak exposure manually (at least for now). Logitech has been on top of the webcam business for years, and their best offering is the Logitech Brio. A decent camera sensor attached to an overly wide lens better suited for YouTubers than remote workers whose face should be the sole focus, not their fancy backdrops. You can force the 4K Logitech Brio to crop most of the background it defaults to showing so you can display what truly matters, your face, but it takes some futzing with settings which should be unnecessary.

If you watch the above demo I’m curious if you’ll wonder like me why multi-lens setups are so common in modern phones but don’t exist in any webcam. The wide angle default is only appropriate to a minority of webcam users (streamers) while most people would benefit from a narrow 35mm to 50mm lens equivalent that would focus on their face instead of their surroundings.
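To put rough numbers on the wide versus narrow lens point, here is a minimal sketch. It assumes a hypothetical webcam with a 4K (3840 by 2160) sensor and a 90 degree diagonal field of view, and a simple rectilinear lens model. The function names are made up for illustration and the figures are not measurements of any specific product. It estimates the full-frame equivalent focal length of that lens and how much of the frame would be left after cropping down to a 50mm equivalent.

```python
import math

# Diagonal of a full-frame (36 x 24 mm) sensor, the usual reference
# for "equivalent" focal lengths.
FULL_FRAME_DIAGONAL_MM = math.hypot(36, 24)


def equivalent_focal_length(diagonal_fov_deg):
    """Full-frame equivalent focal length (mm) for a diagonal field of view."""
    half_angle = math.radians(diagonal_fov_deg) / 2
    return (FULL_FRAME_DIAGONAL_MM / 2) / math.tan(half_angle)


def crop_for_focal_length(width, height, native_fov_deg, target_mm):
    """Pixel size of a centered crop that mimics a longer lens."""
    native_mm = equivalent_focal_length(native_fov_deg)
    # Cropping can only narrow the view, never widen it.
    factor = max(1.0, target_mm / native_mm)
    return round(width / factor), round(height / factor)


# Assumed values: a 4K sensor behind a 90 degree diagonal lens.
native_mm = equivalent_focal_length(90)
crop_w, crop_h = crop_for_focal_length(3840, 2160, 90, 50)

print(f"native lens is roughly {native_mm:.1f}mm equivalent")
print(f"50mm equivalent crop: {crop_w} x {crop_h} pixels")
```

Under those assumptions the cropped frame still holds more pixels than a 720p stream, which is why cropping a 4K webcam is a workable substitute for the narrower lens most people would actually benefit from.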

Generally speaking, I think the Logitech Brio is the best solution for most people given adequate lighting and restricting yourself to the standard non-widescreen mode (I think it’s the default but you can adjust it with the Camera Settings app Logitech provides).

2019 MacBook Pro vs. Logitech Brio w/ Elgato Key Light Air

Hardware Recommendations

Here’s a list of gear I recently recommended as an alternative to my own unwieldy custom setup. The minimum budget is more than double the common “$300 remote stipend” reluctantly relinquished by most companies, while they happily purchase snacks, ergonomic chairs, networking equipment, as well as the typical utility and office leasing costs required for in-office workers. This should give you pause.

This kit is one of the simplest to use and most reliable you’ll likely deal with. Yes, you can find cheaper microphones and cameras if you sacrifice what I believe are essential features:

  • flicker-free consistent lighting (no headaches or artifacts)
  • croppable video output that focuses on your face, not the room
  • narrow pickup microphones with live headphone monitoring
  • headphones to hear yourself and avoid noise feedback loops

Focus              Brand             Model            Price
Lighting           Elgato            Key Light Air    $180
Face (Option 1)    Logitech          Brio             $200
Face (Option 2)    Elgato            Facecam          $200
Voice (Option 1)   Samson            Q2U              $70
Voice (Option 2)   RODE              NT-USB           $170
Voice (Option 3)   Audio-Technica    AT2005USB [1]    $80
Mic Distance       RODE              PSA1             $100
Ears               Sony              MDR 7506         $100
Minimum Total                                         $650
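
As a quick sanity check on that minimum total, the arithmetic can be sketched by picking the cheapest option in each category. The prices come from the list above. The dictionary layout and the `STIPEND` name are only illustrative.

```python
# Prices in US dollars, taken from the gear list above.
options = {
    "Lighting": {"Elgato Key Light Air": 180},
    "Face": {"Logitech Brio": 200, "Elgato Facecam": 200},
    "Voice": {
        "Samson Q2U": 70,
        "RODE NT-USB": 170,
        "Audio-Technica AT2005USB": 80,
    },
    "Mic Distance": {"RODE PSA1": 100},
    "Ears": {"Sony MDR 7506": 100},
}

STIPEND = 300  # the common remote stipend mentioned above

# Cheapest and most expensive pick per category.
minimum_total = sum(min(prices.values()) for prices in options.values())
maximum_total = sum(max(prices.values()) for prices in options.values())

print(f"minimum: ${minimum_total}, maximum: ${maximum_total}")
print(f"minimum is {minimum_total / STIPEND:.2f}x the stipend")
```

Even the cheapest combination lands at more than twice the stipend, and choosing the RODE NT-USB for voice adds another $100.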

This list is by no means exhaustive, and yes it’s very different than what I’ve recommended in the past because the world of remote gear is evolving at last. I’d warn you against integrated solutions (all-in-one lighting, camera, microphone) but it’s possible a company like Elgato will come out with a solution that does it all pretty well in the future. The most expensive single component is lighting, which might surprise you. You may find cheaper alternatives using LED work lamps with color temperature control, but I haven’t tried those out and would only encourage you to get a precise maximum brightness figure in lumens before you settle for one. The most powerful LED work lamp I found on Amazon maxed out at 600 lumens, while the Key Light Air goes up to 1400. You won’t need all that brightness but in my experience 800 to 1000 lumens is the sweet spot in most environments.

The one-stop-shop for remote communication gear doesn’t exist quite yet, but even with just some of the items listed here you’d communicate remotely with higher fidelity than the large majority of office workers worldwide did before the pandemic. While your three-dimensional presence will never be replaceable, it’s possible for two-way communication to have an unprecedented amount of subtlety.

Remote Presence

Photograph of my absurdly over-engineered remote worker desk.

I’ve had a much more intricate setup than the one I recommend above since February 2020. I was preparing to author a video course for Pluralsight and wanted to offer students the best possible learning experience. In countless meetings since, often involving leaders far above my paygrade, it’s impossible to count the number of times someone who matters noticed my facial expressions and asked me to share my thoughts, or reached out to me in DMs afterwards to learn about my setup.

I’ll leave you with this unfair example: a quick video demo of my custom setup, which I shot a few months ago at a time of day when the typical webcam is easily drowned out by backlighting, especially without some bright and diffused lighting pointed at your face to compensate. I’m being caricaturally animated (although not that much for me) to highlight how much of my tone, facial expressions, and overall feelings you can perceive from this video. This is not edited in any way other than to make the file smaller and more compressed for easier playback on the web. Provided your Internet bandwidth is sufficient (and that’s a big if) and your conferencing software of choice doesn’t overly compress audio and video, you’d likely experience something similar on the other end of a call.

Why can't most webcams convey your presence this well? Your phone does.

An easily overlooked issue when discussing putting your face on camera is that some folks may not be comfortable or even willing to be seen inside their remote working location (home or otherwise). Especially in larger meetings where they don’t expect to intervene. I don’t think it’s anyone’s duty to always have a high quality camera on. That kind of visibility is clearly not comfortable for everyone but I think within reason — when not using chat or asynchronous messaging — it’s extremely valuable for all parties.

It’s the responsibility of employers to deploy the kind of budgets already allocated toward in-office communication to remote work equipment. It’s also the role of folks like me (and you) to help educate IT departments and business leaders on hardware solutions that already exist today.

It has become quite absurd to argue that remoteness has to mean becoming a less visible and valued contributor to your organization. I hope this post can help you convince anyone who might still believe that communicating remotely has to be a pain.

  1. Although cheaper than the RODE NT-USB, the Audio-Technica microphone doesn’t come with a pop filter. Thankfully, you can pick one up for fairly cheap and mount it on the microphone boom arm. The Samson does come with a windscreen, which will reduce popping sounds, though not quite as much as a pop filter might. I found it sufficient.