novelai-storage / Stable Diffusion Webui / Commits

Commit f586f497 authored Dec 18, 2023 by Nuullll

Fix device id

parent e4b4a9c4
Showing 1 changed file with 5 additions and 2 deletions

modules/xpu_specific.py (+5, -2)
@@ -33,7 +33,7 @@ has_xpu = check_for_xpu()
 # so that SDPA of each chunk wouldn't require any allocation larger than ARC_SINGLE_ALLOCATION_LIMIT.
 # The heuristic limit (TOTAL_VRAM // 8) is tuned for Intel Arc A770 16G and Arc A750 8G,
 # which is the best trade-off between VRAM usage and performance.
-ARC_SINGLE_ALLOCATION_LIMIT = min(torch.xpu.get_device_properties(shared.cmd_opts.device_id).total_memory // 8, 4 * 1024 * 1024 * 1024)
+ARC_SINGLE_ALLOCATION_LIMIT = {}
 orig_sdp_attn_func = torch.nn.functional.scaled_dot_product_attention
 def torch_xpu_scaled_dot_product_attention(
     query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False, *args, **kwargs
@@ -49,7 +49,10 @@ def torch_xpu_scaled_dot_product_attention(
     Ev = value.size(-1)  # Embedding dimension of the value
     total_batch_size = torch.numel(torch.empty(N))
-    batch_size_limit = max(1, ARC_SINGLE_ALLOCATION_LIMIT // (L * S * query.element_size()))
+    device_id = query.device.index
+    if device_id not in ARC_SINGLE_ALLOCATION_LIMIT:
+        ARC_SINGLE_ALLOCATION_LIMIT[device_id] = min(torch.xpu.get_device_properties(device_id).total_memory // 8, 4 * 1024 * 1024 * 1024)
+    batch_size_limit = max(1, ARC_SINGLE_ALLOCATION_LIMIT[device_id] // (L * S * query.element_size()))
     if total_batch_size <= batch_size_limit:
         return orig_sdp_attn_func(
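The change replaces the module-level limit, which was computed once from shared.cmd_opts.device_id at import time, with a dictionary keyed by the query tensor's device index and filled lazily on first use, so the chunking limit matches whichever XPU the tensors actually live on. Below is a minimal standalone sketch of that lazy per-device cache pattern; get_total_memory is a hypothetical stand-in for torch.xpu.get_device_properties(device_id).total_memory, and the 16 GiB figure is an assumed example value, not anything from the commit.

ARC_SINGLE_ALLOCATION_LIMIT = {}  # device index -> single-allocation byte limit, filled on first use


def get_total_memory(device_id):
    # Hypothetical stand-in for torch.xpu.get_device_properties(device_id).total_memory:
    # assume every device reports 16 GiB of VRAM for this sketch.
    return 16 * 1024 * 1024 * 1024


def allocation_limit(device_id):
    # Mirror of the diff above: compute min(TOTAL_VRAM // 8, 4 GiB) once per device and cache it.
    if device_id not in ARC_SINGLE_ALLOCATION_LIMIT:
        ARC_SINGLE_ALLOCATION_LIMIT[device_id] = min(
            get_total_memory(device_id) // 8,
            4 * 1024 * 1024 * 1024,
        )
    return ARC_SINGLE_ALLOCATION_LIMIT[device_id]


if __name__ == "__main__":
    # 16 GiB // 8 = 2 GiB, below the 4 GiB cap, so the cached limit is 2147483648 bytes.
    print(allocation_limit(0))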