
[Feature Request] Add Liger CE Loss #2692


Open
pbontrager opened this issue May 7, 2025 · 2 comments · May be fixed by #2741
Labels
community help wanted: We would love the community's help completing this issue

Comments

@pbontrager
Contributor

Add a new loss in the cross_entropy_loss.py file that inherits from the SFT loss but calls Liger's fused_linear_cross_entropy loss. It will need to handle the case where the input is a DTensor and convert it before calling the Liger loss.
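A minimal sketch of what such a loss might look like. The class name, the constructor arguments, and the call convention are assumptions for illustration, not torchtune's actual API; the SFT base class is omitted to keep the sketch self-contained, and the DTensor conversion is duck-typed so it reads without a distributed runtime:

```python
def to_local(tensor):
    """Materialize a DTensor into a plain local tensor before handing it to
    the Liger kernel; plain tensors pass through unchanged. Duck-typed on
    DTensor's full_tensor() method so no distributed setup is required."""
    full = getattr(tensor, "full_tensor", None)
    return full() if callable(full) else tensor


class LigerLinearCrossEntropyLoss:
    """Sketch of a loss deferring to Liger's fused linear cross-entropy.

    In torchtune this would inherit from the SFT loss in
    cross_entropy_loss.py; the base class is left out here so the sketch
    stands on its own.
    """

    def __init__(self, ignore_index: int = -100):
        self.ignore_index = ignore_index

    def __call__(self, weight, hidden, targets):
        # Liger's fused kernel expects regular tensors, so convert any
        # DTensor inputs first (the TP-sharded tied-embedding case is
        # discussed separately below).
        weight = to_local(weight)
        hidden = to_local(hidden)

        # Imported lazily so the sketch can be read without liger-kernel
        # installed; the class path is Liger's public API.
        from liger_kernel.transformers import LigerFusedLinearCrossEntropyLoss

        loss_fn = LigerFusedLinearCrossEntropyLoss(ignore_index=self.ignore_index)
        # Liger fuses the output projection into the loss: it takes the
        # projection weight and the pre-projection hidden states directly.
        return loss_fn(
            weight,
            hidden.reshape(-1, hidden.shape[-1]),
            targets.reshape(-1),
        )
```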

Edge case: the model output is a tied embedding that is TP-sharded (a DTensor). In that case we'll either have to unshard and then reshard the weight every step, or throw an error. (This assumes that Liger losses don't work with sharded weights.)
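For the error-throwing option, a guard like the following could run before calling the Liger loss. It duck-types on DTensor's `placements` attribute and the placement's `is_shard()` method so the check is illustrable without torch.distributed; the function name and message are assumptions:

```python
def check_weight_for_liger(weight):
    """Raise if the output projection weight is still sharded, e.g. a TP
    DTensor whose placements include a Shard, since Liger's fused kernel
    expects a plain local tensor. Replicated DTensors and regular tensors
    pass through unchanged."""
    placements = getattr(weight, "placements", None)
    if placements is not None and any(
        getattr(p, "is_shard", lambda: False)() for p in placements
    ):
        raise ValueError(
            "Liger fused_linear_cross_entropy does not support TP-sharded "
            "(DTensor) weights; unshard the tied embedding first or fall "
            "back to the compiled linear cross-entropy loss."
        )
    return weight
```

Unsharding every step (DTensor's `full_tensor()`) would be the alternative, at the cost of an all-gather of the embedding weight per step.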

A good validation of this feature would be to check whether this loss improves the numbers here even further over the compiled linear cross-entropy loss.

@pbontrager pbontrager added the community help wanted We would love the community's help completing this issue label May 7, 2025
@mananchawla2005

I wanna work on this one

@joecummings
Contributor

> I wanna work on this one

Thanks, feel free to take it on! Please tag me and @pbontrager on the PR.

@mananchawla2005 mananchawla2005 linked a pull request May 16, 2025 that will close this issue