Controlling strength of `ApplyImpulseResponse` #388

worldveil · 2025-04-21T06:33:47Z

When `ApplyImpulseResponse` gets applied, the effect is sometimes very strong, to the point where you can't really hear the original audio.

The silliest thing I can think of is adding a `min_snr_db`/`max_snr_db` argument pair, sampling a target SNR from that range, and then adding the dry (original) and wet (convolved) signals together so that the target SNR is met.
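A rough sketch of that idea, using plain NumPy rather than anything in the library. It assumes "SNR" here means the RMS ratio of the dry signal to the scaled wet signal; the function names and the synthetic RIR are made up for illustration.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of a 1-D signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def mix_at_snr(dry, wet, snr_db):
    """Add `wet` to `dry`, scaling `wet` so the dry-to-wet RMS ratio is `snr_db`.

    Hypothetical helper, not part of audiomentations.
    """
    n = min(len(dry), len(wet))
    dry, wet = dry[:n], wet[:n]
    wet_rms = rms(wet)
    if wet_rms == 0.0:
        return dry.copy()  # nothing to add
    target_wet_rms = rms(dry) / (10.0 ** (snr_db / 20.0))
    return dry + wet * (target_wet_rms / wet_rms)


rng = np.random.default_rng(0)
dry = rng.standard_normal(8000)

# Synthetic stand-in for a RIR: exponentially decaying noise (assumption).
rir = np.exp(-np.arange(2000) / 400.0) * rng.standard_normal(2000)
wet = np.convolve(dry, rir)[: len(dry)]

# Sample the target SNR from a [min_snr_db, max_snr_db] range.
min_snr_db, max_snr_db = 5.0, 20.0
snr_db = rng.uniform(min_snr_db, max_snr_db)
mixed = mix_at_snr(dry, wet, snr_db)
```

A higher `snr_db` keeps the result closer to the dry signal; a lower one lets more of the convolved signal through.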

What do you think @iver56 ?

iver56 · 2025-04-21T07:09:47Z

That's a legit question! I imagine two ways of getting less prominent perturbations:

1. Mix the input audio with the output audio (as you suggested)
This could be done (I have thought about it before) with a wrapper class that inputs the transform instance and the output amount (as a fraction) that you want in your mix. This class is not implemented yet.

Note that in the case of `ApplyImpulseResponse`, the input audio and the output audio are typically not time-aligned, depending on the chosen RIR, so you might end up with unexpected or unwanted coloration artifacts (comb filtering) or even flanging. In other words, I would not recommend this approach in your case.
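For reference, a minimal sketch of what such a wrapper could look like. The class name `DryWetMix` and its `wet_fraction` parameter are hypothetical, not an existing audiomentations API, and the sketch assumes the wrapped transform is called with `samples` and `sample_rate` and returns an array of the same shape as its input. It does nothing about the time-alignment caveat above.

```python
import numpy as np


class DryWetMix:
    """Hypothetical wrapper that blends a transform's output with its input."""

    def __init__(self, transform, wet_fraction):
        if not 0.0 <= wet_fraction <= 1.0:
            raise ValueError("wet_fraction must be in [0, 1]")
        self.transform = transform
        self.wet_fraction = wet_fraction

    def __call__(self, samples, sample_rate):
        wet = self.transform(samples=samples, sample_rate=sample_rate)
        if wet.shape != samples.shape:
            raise ValueError("wrapped transform must preserve the input shape")
        # Linear crossfade between the dry input and the wet output.
        return (1.0 - self.wet_fraction) * samples + self.wet_fraction * wet


def toy_reverb(samples, sample_rate):
    """Stand-in transform: direct sound plus one echo 50 ms later."""
    ir = np.zeros(int(0.05 * sample_rate), dtype=np.float32)
    ir[0] = 1.0
    ir[-1] = 0.5
    return np.convolve(samples, ir)[: len(samples)].astype(np.float32)


sample_rate = 16000
rng = np.random.default_rng(0)
samples = rng.standard_normal(1600).astype(np.float32)

mixed = DryWetMix(toy_reverb, wet_fraction=0.25)(samples, sample_rate)
```

With `wet_fraction=0.0` the wrapper returns the input unchanged, and with `1.0` it returns the wrapped transform's output as-is.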

2. Change the RIRs
A more realistic-sounding solution to your problem is to use less extreme RIRs. One way to achieve that is to massage your dataset of RIRs, e.g. by removing long RIRs or by modifying them (e.g. tapering the tail somehow). Alternatively, you can find a different dataset of RIRs.

I guess it could be possible to do any kind of RIR modification on the fly, but it would be for advanced users. Maybe I could add a rir_transform argument that is a callable that can modify the RIR before it gets used. What do you think?
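To make the idea concrete, here is a sketch of the kind of callable that could be passed as such a `rir_transform`. The argument does not exist yet, so the callable's signature, the name `taper_rir` and the default durations are all assumptions. It truncates the RIR and fades out its tail with a half-cosine window.

```python
import numpy as np


def taper_rir(rir, sample_rate, max_duration_s=0.3, fade_duration_s=0.1):
    """Truncate a RIR to `max_duration_s` and fade its tail to zero.

    Hypothetical helper, not part of audiomentations.
    """
    max_len = max(1, int(round(max_duration_s * sample_rate)))
    out = np.array(rir[:max_len], dtype=np.float32)  # copy, leave input intact
    fade_len = min(len(out), int(round(fade_duration_s * sample_rate)))
    if fade_len > 1:
        # Half-cosine fade going from 1 down to 0.
        fade = 0.5 * (1.0 + np.cos(np.linspace(0.0, np.pi, fade_len)))
        out[-fade_len:] *= fade.astype(np.float32)
    return out


sample_rate = 16000
rng = np.random.default_rng(0)

# Synthetic one-second RIR: exponentially decaying noise (assumption).
rir = (np.exp(-np.arange(sample_rate) / 4000.0)
       * rng.standard_normal(sample_rate)).astype(np.float32)

tapered = taper_rir(rir, sample_rate)

# Applying the shortened RIR by hand, trimmed to the input length.
dry = rng.standard_normal(8000).astype(np.float32)
wet = np.convolve(dry, tapered)[: len(dry)]
```

The same function could also be run offline over a RIR dataset, which is the "massage your dataset" route and needs no library change.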

worldveil · 2025-04-21T07:24:17Z

RE: (1) time alignment is not necessarily the goal, diversity of the training set is :)

I may just do this more manually, but do you have an example in the code of a wrapper class that takes a transform instance as input? That way, if I find it works well, the pattern is more likely to be PR'able back to the repo.

iver56 · 2025-04-21T07:40:16Z

You could have a look at PostGain, a class that I've been toying with, but which is not officially released/exposed yet: https://github.com/iver56/audiomentations/blob/main/audiomentations/core/post_gain.py
