[Hackathon 8th No.9] Reproduce the losses needed for DAC training in PaddleSpeech #3988

cchenhaifeng · 2025-02-14T09:36:19Z

PR types

New features

PR changes

APIs

Describe

PaddlePaddle/community#1062

paddle-bot · 2025-02-14T09:36:25Z

Thanks for your contribution!

CLAassistant · 2025-02-14T09:36:27Z

All committers have signed the CLA.

cchenhaifeng · 2025-02-14T15:08:48Z

These are the numerical results I got after testing locally. Please check.

cchenhaifeng · 2025-02-15T04:49:31Z

Is something wrong with the original repository? I have not modified the code here. @zxcd @luotao1

zxcd · 2025-02-17T03:05:57Z

paddlespeech/t2s/modules/losses.py

+        "DDSP: Differentiable Digital Signal Processing."
+        International Conference on Learning Representations. 2019.
+
+    Implementation copied from: https://github.com/descriptinc/lyrebird-audiotools/blob/961786aa1a9d628cca0c0486e5885a457fe70c1a/audiotools/metrics/spectral.py


Why not use the develop branch?

zxcd · 2025-02-17T03:20:31Z

tests/unit/tts/test_losses.py

+
+
+def test_multi_scale_stft_loss():
+    a = np.linspace(0, 199.98, 10000)


I suggest using a real wav file to test the loss.

zxcd · 2025-02-17T03:20:52Z

tests/unit/tts/test_losses.py

+
+
+def test_sisdr_loss():
+    a = np.linspace(0, 199.98, 10000)


Same as above.

zxcd · 2025-02-17T03:21:02Z

tests/unit/tts/test_losses.py

+        def __call__(self, x):
+            return x * (-0.2)
+
+    a = np.linspace(0, 199.98, 10000)


Same as above.

zxcd · 2025-02-17T03:21:49Z

paddlespeech/t2s/modules/losses.py

+    weight : float, optional
+        Weight of this loss, defaults to 1.0.
+
+    Implementation copied from: https://github.com/descriptinc/lyrebird-audiotools/blob/961786aa1a9d628cca0c0486e5885a457fe70c1a/audiotools/metrics/distance.py


Same as above.

paddlespeech/t2s/modules/losses.py

tests/unit/tts/test_losses.py

zxcd · 2025-02-17T03:27:22Z

paddlespeech/t2s/modules/losses.py

+            pow: float=2.0,
+            weight: float=1.0,
+            match_stride: bool=False,
+            window_type: str=None, ):


When the default value is None, I suggest using the Optional type.

zxcd · 2025-02-17T08:03:02Z

paddlespeech/t2s/modules/losses.py

+
+    def __init__(
+            self,
+            scaling: int=True,


int=True seems somewhat ambiguous
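The two annotation comments above can be illustrated with a minimal sketch. The function name `describe_loss_config` is hypothetical and exists only to show the type hints; it is not part of the PR. It assumes `scaling` is meant as an on/off flag, which the `True` default suggests.

```python
from typing import Optional


def describe_loss_config(
        # A True/False flag reads more clearly as bool than as int.
        scaling: bool = True,
        # A parameter whose default is None is annotated as Optional[...].
        window_type: Optional[str] = None,
) -> str:
    """Hypothetical helper that only illustrates the annotations."""
    window = "default" if window_type is None else window_type
    return f"scaling={scaling}, window={window}"


summary = describe_loss_config(window_type="hann")
```

`Optional[str]` is shorthand for `Union[str, None]`, so a type checker accepts both a string and `None` for `window_type`.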

zxcd · 2025-02-17T08:14:56Z

paddlespeech/t2s/modules/losses.py

+                x: Union[AudioSignal, paddle.Tensor],
+                y: Union[AudioSignal, paddle.Tensor]):
+        eps = 1e-8
+        # nb, nc, nt


If you mean the tensor shape, I suggest writing it as (B, C, T).

I have modified it as requested. Please check it.

zxcd · 2025-02-17T08:19:40Z

Are you interested in participating in the training of the DAC model (Hackathon 8th No.5) as well?

cchenhaifeng · 2025-02-19T08:42:47Z

> Are you interested in participating in the training of the DAC model (Hackathon 8th No.5) as well?

Let me take a look?

cchenhaifeng · 2025-02-20T04:42:04Z

It seems Task 5 can only be started once Task 9 is finished? So should I wait for Task 9 to be merged before starting? @zxcd

luotao1 · 2025-02-20T06:14:15Z

Yes, please finish Task 9 first.

cchenhaifeng · 2025-02-20T10:41:25Z

> Yes, please finish Task 9 first.

Oh, Task 9 is already finished and ready for review.

luotao1 · 2025-02-20T11:36:30Z

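> Task 9 is already finished and ready for review.

I didn't know you had finished all the changes. Please reply to each of the comments above, and take a look at the Code Review guidelines.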

zxcd · 2025-02-20T12:44:03Z

tests/unit/tts/test_losses.py

+
+
+def get_input():
+    x = np.array([


If you have no special requirements for the audio file, you can use our demo audio, such as https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav or https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/tests/unit/audiotools/test_audiotools.sh

If you do have special requirements, send me the audio link and I will upload it to the server.

> If you have no special requirements for the audio file, you can use our demo audio, such as https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav or https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/tests/unit/audiotools/test_audiotools.sh
>
> If you do have special requirements, send me the audio link and I will upload it to the server.

There is no dedicated audio file for testing the loss functions, so I'm using the wav file you provided. I've finished the revisions; please check.

tests/unit/ci.sh

cchenhaifeng · 2025-02-21T02:48:38Z

I think we should ask the author of the audiotools PR to double-check it; its imports seem to have a problem. @luotao1

luotao1 · 2025-02-21T03:34:56Z

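You can join the 如流 (Ruliu) group to discuss this.

> I think we should ask the author of the audiotools PR to double-check it; its imports seem to have a problem.

@DrRyanHuang, please also take a look when you have time.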

DrRyanHuang · 2025-02-21T04:23:11Z

> I think we should ask the author of the audiotools PR to double-check it; its imports seem to have a problem.

@cchenhaifeng
Currently, since audiotools exists independently within PaddleSpeech, its installation and testing are also conducted separately. If you wish to use audiotools, you can install it yourself by following the requirements specified in paddlespeech/audiotools/requirements.txt.

This is demonstrated in the testing of audiotools:
https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/tests/unit/audiotools/test_audiotools.sh#L1

I previously encountered the issue of cyclic imports as well. In this PR, you can fix it using deferred imports; I see that you’ve already done so. If there are any other issues, we can communicate at any time.
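The deferred-import fix mentioned above can be sketched as follows. This is a minimal, self-contained illustration, not the PR's code: the module names `demo_core` and `demo_losses` are hypothetical stand-ins for two packages that depend on each other, and the sketch writes them to a temporary directory so it can run anywhere.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module: imports from demo_losses at the top level.
CORE_SRC = textwrap.dedent('''
    from demo_losses import scale

    def load():
        return scale([1.0, 2.0])
''')

# Hypothetical module: needs demo_core, but defers the import.
LOSSES_SRC = textwrap.dedent('''
    def scale(values):
        return [2 * v for v in values]

    def loss_on_loaded():
        # Deferred import: resolved when the function is called, after
        # demo_losses has finished initialising. A top-level
        # "from demo_core import load" here would create a cycle.
        from demo_core import load
        return sum(load())
''')


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_core.py").write_text(CORE_SRC)
        Path(tmp, "demo_losses.py").write_text(LOSSES_SRC)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            losses = importlib.import_module("demo_losses")
            return losses.loss_on_loaded()
        finally:
            # Clean up so the temporary modules do not leak.
            sys.path.remove(tmp)
            sys.modules.pop("demo_core", None)
            sys.modules.pop("demo_losses", None)


result = run_demo()
```

Moving the import into the function body trades a small lookup on each call for the freedom to load the two modules in either order.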

cchenhaifeng · 2025-02-21T05:12:05Z

> > I think we should ask the author of the audiotools PR to double-check it; its imports seem to have a problem.
>
> @cchenhaifeng Currently, since audiotools exists independently within PaddleSpeech, its installation and testing are also conducted separately. If you wish to use audiotools, you can install it yourself by following the requirements specified in paddlespeech/audiotools/requirements.txt.
>
> This is demonstrated in the testing of audiotools: https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/tests/unit/audiotools/test_audiotools.sh#L1
>
> I previously encountered the issue of cyclic imports as well. In this PR, you can fix it using deferred imports; I see that you’ve already done so. If there are any other issues, we can communicate at any time.

Okay, I'll get back to you if I have any other questions

cchenhaifeng · 2025-02-21T06:13:00Z

> > I think we should ask the author of the audiotools PR to double-check it; its imports seem to have a problem.
>
> @cchenhaifeng Currently, since audiotools exists independently within PaddleSpeech, its installation and testing are also conducted separately. If you wish to use audiotools, you can install it yourself by following the requirements specified in paddlespeech/audiotools/requirements.txt.
>
> This is demonstrated in the testing of audiotools: https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/tests/unit/audiotools/test_audiotools.sh#L1
>
> I previously encountered the issue of cyclic imports as well. In this PR, you can fix it using deferred imports; I see that you’ve already done so. If there are any other issues, we can communicate at any time.

Is it normal for the test to get stuck here all the time?

DrRyanHuang · 2025-02-21T06:31:03Z

> Is it normal for the test to get stuck here all the time?

Well, it's not normal. Does the code pass the unit tests for utils.py on your local machine? If it does, then the issue might be with the CI system.

cchenhaifeng · 2025-02-21T06:40:57Z

> > Is it normal for the test to get stuck here all the time?
>
> Well, it's not normal. Does the code pass the unit tests for utils.py on your local machine? If it does, then the issue might be with the CI system.

I haven't tried that; I have been testing test_losses.py. But I only changed how the packages are imported, not the logic, so in theory it should pass.

cchenhaifeng · 2025-02-21T08:06:20Z

> > Is it normal for the test to get stuck here all the time?
>
> Well, it's not normal. Does the code pass the unit tests for utils.py on your local machine? If it does, then the issue might be with the CI system.

The local results are correct

DrRyanHuang · 2025-02-21T08:15:11Z

> The local results are correct

The CI environment uses:

* Python 3.8
* paddlepaddle-gpu==2.5

Have you aligned your PaddlePaddle version with the CI environment?
We need to support both Paddle 2.5 and Paddle 2.6.

cchenhaifeng · 2025-02-21T08:25:58Z

> > The local results are correct
>
> The CI environment uses:
> * Python 3.8
> * paddlepaddle-gpu==2.5
>
> Have you aligned your PaddlePaddle version with the CI environment? We need to support both Paddle 2.5 and Paddle 2.6.

My Python is 3.12

cchenhaifeng · 2025-02-21T08:44:55Z

> > The local results are correct
>
> The CI environment uses:
> * Python 3.8
> * paddlepaddle-gpu==2.5
>
> Have you aligned your PaddlePaddle version with the CI environment? We need to support both Paddle 2.5 and Paddle 2.6.

No problem locally

cchenhaifeng · 2025-02-21T10:03:16Z

All the test code has been verified. I'm re-running the CI; please check @zxcd

cchenhaifeng · 2025-02-21T10:48:18Z

All the test code has been verified, please check @zxcd

paddlespeech/__init__.py

zxcd · 2025-02-24T03:31:50Z

paddlespeech/audiotools/core/__init__.py

@@ -26,3 +24,5 @@
 from .audio_signal import AudioSignal
 from .audio_signal import STFTParams
 from .loudness import Meter
+from paddlespeech.t2s.modules import fft_conv1d


Please use either relative or absolute import paths consistently within one file.

zxcd · 2025-02-24T03:38:28Z

tests/unit/ci.sh

@@ -1,6 +1,7 @@
 function main(){
  set -ex
  speech_ci_path=`pwd`
+  pip install ffmpeg flatten_dict ffmpy


test_audiotools.sh already installs these packages, so I don't think they need to be installed a second time here?

zxcd · 2025-02-24T03:38:59Z

tests/unit/tts/test_losses.py

@@ -0,0 +1,61 @@
+# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.


zxcd · 2025-02-24T03:40:20Z

tests/unit/tts/test_losses.py

+    x, y = get_input()
+    loss = MultiScaleSTFTLoss()
+    pd_loss = loss(x, y)
+    np.allclose(pd_loss.numpy(), 7.5622)


Can the accuracy be aligned to 1e-6?

zxcd · 2025-02-24T03:44:35Z

paddlespeech/t2s/modules/losses.py

+        noise = (e_res**2).sum(axis=1)
+        sdr = -10 * paddle.log10(signal / noise + eps)
+
+        if self.clip_min is not None:


Should this also check self.clip_min != 'None'?
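For readers following the diff fragment above, here is a NumPy sketch of an SI-SDR style loss with the optional clipping step. It is an illustration under stated assumptions, not the code in the PR: the function name `si_sdr_loss` and its argument names are hypothetical, inputs of shape (B, C, T) are flattened to (B, C*T), and `clip_min` is assumed to be either a float or `None` (never the string `'None'`).

```python
from typing import Optional

import numpy as np


def si_sdr_loss(reference: np.ndarray,
                estimate: np.ndarray,
                scaling: bool = True,
                zero_mean: bool = True,
                clip_min: Optional[float] = None,
                eps: float = 1e-8) -> float:
    """Hypothetical sketch: negative SI-SDR in dB, averaged over the batch."""
    batch = reference.shape[0]
    # (B, C, T) -> (B, C*T): treat all samples of an item as one vector.
    ref = reference.reshape(batch, -1)
    est = estimate.reshape(batch, -1)
    if zero_mean:
        ref = ref - ref.mean(axis=1, keepdims=True)
        est = est - est.mean(axis=1, keepdims=True)
    if scaling:
        # Optimal scale that projects the estimate onto the reference.
        dot = (est * ref).sum(axis=1, keepdims=True) + eps
        energy = (ref**2).sum(axis=1, keepdims=True) + eps
        scale = dot / energy
    else:
        scale = 1.0
    e_true = scale * ref      # target component
    e_res = est - e_true      # residual (distortion) component
    signal = (e_true**2).sum(axis=1)
    noise = (e_res**2).sum(axis=1)
    sdr = -10 * np.log10(signal / noise + eps)
    if clip_min is not None:
        # Lower bound on the loss, applied only when a bound is given.
        sdr = np.maximum(sdr, clip_min)
    return float(sdr.mean())


rng = np.random.default_rng(0)
ref = np.sin(np.linspace(0, 200, 8000)).reshape(2, 1, 4000)
est = ref + 0.1 * rng.standard_normal(ref.shape)
loss = si_sdr_loss(ref, est)
loss_rescaled = si_sdr_loss(ref, 3.0 * est)
loss_clipped = si_sdr_loss(ref, est, clip_min=0.0)
```

With `scaling=True`, rescaling the estimate leaves the loss essentially unchanged, which is the scale-invariance the metric is named for.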

paddlespeech/audiotools/core/util.py

cchenhaifeng · 2025-02-24T13:57:31Z

CI has all passed. Importing the packages at the top level causes a circular-import error, so deferred imports are used instead. @zxcd

zxcd · 2025-02-25T04:04:54Z

tests/unit/tts/test_losses.py

+    x, y = get_input()
+    loss = MultiScaleSTFTLoss()
+    pd_loss = loss(x, y)
+    np.allclose(pd_loss.numpy(), 7.562150, rtol=1e-06)


Can you put the audiotools code that produced 7.562150 in a comment for easy verification?

Where are the code comments?

cchenhaifeng · 2025-02-25T05:41:07Z

As you can see, test_util gets stuck in a loop after the deletion. @zxcd

zxcd · 2025-02-25T09:56:17Z

tests/unit/tts/test_losses.py

+    x, y = get_input()
+    loss = GANLoss(My_discriminator0())
+    pd_loss0, pd_loss1 = loss(x, y)
+    np.allclose(pd_loss0.numpy(), -0.102722, rtol=1e-06)


Use self.assertEqual, or wrap np.allclose in an assert.

> Use self.assertEqual, or wrap np.allclose in an assert.

Done
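The point of this review comment is that a bare `np.allclose(...)` call returns a boolean that is then discarded, so the test can never fail. A minimal sketch of the difference, using hypothetical helper names (`unchecked`, `checked`) and the expected value 7.562150 quoted in the thread:

```python
import numpy as np

EXPECTED = 7.562150


def unchecked(value: float) -> str:
    # The boolean result is thrown away, so a wrong value goes unnoticed.
    np.allclose(value, EXPECTED, rtol=1e-6)
    return "passed"


def checked(value: float) -> str:
    # Wrapping the call in assert makes a mismatch raise AssertionError.
    assert np.allclose(value, EXPECTED, rtol=1e-6), f"{value} != {EXPECTED}"
    return "passed"


wrong_value_result = unchecked(0.0)     # still "passed": the check has no effect
right_value_result = checked(7.5621504)

try:
    checked(0.0)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```

`np.testing.assert_allclose(actual, desired, rtol=1e-6)` is another option; it raises on mismatch and prints the differing values.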

zxcd

LGTM

luotao1 · 2025-03-04T02:12:49Z

hi, @cchenhaifeng

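Thank you very much for your contribution to PaddlePaddle. We are running an organization called PFCC, the PaddlePaddle open-source contributors' club. Only developers who have had code merged into PaddlePaddle can join. The club holds a regular meeting every two weeks (attendance is optional) and occasionally organizes offline meetups. For details, see the profile page at https://github.com/luotao1.
If you are interested in PFCC, please send an email to [email protected] and we will invite you to join.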

paddle-bot bot added the contributor label Feb 14, 2025

mergify bot added the T2S label Feb 14, 2025

mergify bot added the Test label Feb 14, 2025

luotao1 assigned luotao1 and zxcd Feb 17, 2025

zxcd reviewed Feb 17, 2025


luotao1 mentioned this pull request Feb 18, 2025

[Hackathon 8th] Open Source Contribution Individual Challenge (Preview Edition) PaddlePaddle/Paddle#70746

Closed

zxcd reviewed Feb 20, 2025



cchenhaifeng force-pushed the develop branch from e1c9ed6 to 0afc86d Compare February 20, 2025 15:47

add DAC loss

d92c45e

cchenhaifeng force-pushed the develop branch from 413c466 to d92c45e Compare February 21, 2025 05:10

cchenhaifeng added 2 commits February 21, 2025 16:55

fix bug

f1bc659

fix codestyle

1ed7784

zxcd reviewed Feb 24, 2025


cchenhaifeng added 2 commits February 24, 2025 13:53

fix codestyle

d58286d

fix codestyle

2862ae5

zxcd reviewed Feb 25, 2025


cchenhaifeng added 2 commits February 25, 2025 12:35

fix codestyle

50e4f4e

fix codestyle

fef33fb

zxcd reviewed Feb 25, 2025


fix codestyle

4006864

zxcd approved these changes Feb 26, 2025


zxcd merged commit d7bf915 into PaddlePaddle:develop Feb 26, 2025
5 checks passed

zxcd mentioned this pull request Feb 27, 2025

[Hackathon 8th No.9] Reproduce the losses needed for DAC training in PaddleSpeech #3954

Closed

GreatV mentioned this pull request Feb 28, 2025

PaddleSpeech 1.5.0 Release Note #3996

Open

luotao1 mentioned this pull request Mar 5, 2025

[Hackathon 8th] Open Source Contribution Individual Challenge PaddlePaddle/Paddle#71310

Open


