Add MoE TEGroupedMLP support for Minitron Pruning#1038
kevalmorabia97 wants to merge 1 commit into kmorabia/minitron-full-te-spec from
Conversation
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Codecov Report: all modified and coverable lines are covered by tests.

```diff
@@                        Coverage Diff                         @@
##   kmorabia/minitron-full-te-spec    #1038      +/-   ##
==================================================================
- Coverage                   70.11%   70.08%   -0.03%
==================================================================
  Files                         221      221
  Lines                       25459    25499      +40
==================================================================
+ Hits                        17851    17872      +21
- Misses                       7608     7627      +19
```
What does this PR do?
Add support for MoE TEGroupedMLP (`moe_grouped_gemm=True`), which should be much more efficient than the current SequentialMLP path. Since the underlying kernels differ, the pruned model may differ slightly from one pruned via SequentialMLP.
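To illustrate why grouped GEMM constrains Minitron-style width pruning, here is a minimal numpy sketch. It is not the PR's implementation; the layout `(num_experts, ffn_hidden, hidden)`, the function name `prune_grouped_mlp`, and the L2-norm importance score are all illustrative assumptions. The key point it demonstrates is that with a grouped GEMM, every expert must end up with the same pruned FFN width, so channel importance is aggregated across experts before selecting which channels to keep.

```python
import numpy as np

def prune_grouped_mlp(fc1, fc2, keep):
    """Uniformly prune the FFN hidden dimension of a grouped-GEMM MoE MLP.

    fc1: (num_experts, ffn_hidden, hidden)  stacked expert up-projections
    fc2: (num_experts, hidden, ffn_hidden)  stacked expert down-projections
    keep: number of FFN channels to retain; the same for every expert,
          since grouped GEMM requires a uniform expert shape.
    """
    # Hypothetical importance score: L2 norm of each FFN channel's fc1 rows,
    # summed across experts so all experts agree on which channels survive.
    scores = np.linalg.norm(fc1, axis=2).sum(axis=0)   # shape: (ffn_hidden,)
    idx = np.sort(np.argsort(scores)[::-1][:keep])     # top-`keep` channels
    return fc1[:, idx, :], fc2[:, :, idx]

# Example: 4 experts, hidden=8, ffn_hidden=16, pruned down to 12 channels.
fc1 = np.random.randn(4, 16, 8)
fc2 = np.random.randn(4, 8, 16)
p1, p2 = prune_grouped_mlp(fc1, fc2, keep=12)
print(p1.shape, p2.shape)  # (4, 12, 8) (4, 8, 12)
```

A sequential (per-expert) MLP could in principle score channels per expert; the grouped layout forces the shared ranking above, which is one reason pruned results can differ slightly between the two paths.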
Testing
Before your PR is "Ready for review"

- Make sure you read and follow the Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoid hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: N/A

Additional Information