feat: Implement SME fast mode kernels #1283

Open
DavidMansell wants to merge 1 commit into main from pr/add-sme1-fast-mode

@DavidMansell

This is achieved with the following changes:

  • Change BF16 4VLx1VL SME and SME2 kernels to versions that consume 2VL-packed LHS input.
  • Remove now unneeded FP32->BF16 4VL interleave.
  • Replace 2VL and 1VL FP32->BF16 interleaves with SME versions.
  • Add SME BF16 kernels to gemm_fp32 kernel list.

Change-Id: I77ae62563fe22a3f207ffee46b5388a9dfd2ee19

Signed-off-by: David Mansell <David.Mansell@arm.com>
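
For readers unfamiliar with what an FP32->BF16 interleave does, the following is a minimal scalar sketch, not the SME code from this PR. It assumes round-to-nearest-even conversion and a packed layout in which each block of `block_rows` rows stores pairs of adjacent K values row by row, zero-padded at the edges. The names `fp32_to_bf16`, `interleave_fp32_to_bf16` and `block_rows` are hypothetical; `block_rows` stands in for the 1VL or 2VL block height, whose actual value depends on the hardware vector length.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Scalar FP32 -> BF16 conversion using round-to-nearest-even.
// Assumption: the real kernels convert with hardware instructions,
// whose rounding behaviour may differ from this reference.
inline uint16_t fp32_to_bf16(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        // NaN: keep the top bits and force a quiet NaN.
        return static_cast<uint16_t>((bits >> 16) | 0x0040u);
    }
    // Add 0x7fff plus the lowest kept bit so exact ties round to even.
    const uint32_t bias = 0x7fffu + ((bits >> 16) & 1u);
    return static_cast<uint16_t>((bits + bias) >> 16);
}

// Hypothetical packed layout: for each block of `block_rows` rows,
// for each pair of K values, emit the pair for every row in the block.
// Out-of-range rows and columns are zero-padded.
std::vector<uint16_t> interleave_fp32_to_bf16(const float *in, size_t rows,
                                              size_t cols, size_t ld,
                                              size_t block_rows) {
    const size_t k_block = 2; // two BF16 values per 32-bit lane (assumed)
    const size_t padded_rows = (rows + block_rows - 1) / block_rows * block_rows;
    const size_t padded_cols = (cols + k_block - 1) / k_block * k_block;
    std::vector<uint16_t> out(padded_rows * padded_cols, 0);

    size_t o = 0;
    for (size_t r0 = 0; r0 < padded_rows; r0 += block_rows) {
        for (size_t k0 = 0; k0 < padded_cols; k0 += k_block) {
            for (size_t r = r0; r < r0 + block_rows; ++r) {
                for (size_t k = k0; k < k0 + k_block; ++k, ++o) {
                    if (r < rows && k < cols) {
                        out[o] = fp32_to_bf16(in[r * ld + k]);
                    }
                }
            }
        }
    }
    return out;
}
```

Under this assumed layout, a kernel that consumes 2VL-packed LHS input would read the same buffer regardless of its own tile height, which is one plausible reason a separate 4VL interleave is no longer needed.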
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2022-2026 Arm Limited.

The 2022- start year should stay.

@@ -1,5 +1,5 @@
/*
* Copyright (c) 2025-2026 Arm Limited.

Similarly, 2025- should stay. I think this applies to all the renamed files; each should preserve its original copyright start year.
