Commit graph

29 commits

Author SHA1 Message Date
Kovid Goyal
a7c06b38e6
We don't actually need vzeroupper at start of function
GCC emits vzeroupper automatically when compiling with native
optimizations but we still need it otherwise
2024-02-25 09:57:43 +05:30
Kovid Goyal
720618bc37
Use go 1.22 for building
It supports PCALIGN on non-ARM arches as well
2024-02-25 09:57:43 +05:30
Kovid Goyal
ede4d7fbca
... 2024-02-25 09:57:42 +05:30
Kovid Goyal
c01b959723
Fix Go unaligned index implementation 2024-02-25 09:57:42 +05:30
Kovid Goyal
7467307200
Add some alignment tests 2024-02-25 09:57:42 +05:30
Kovid Goyal
bbdb0b15f3
DRYer 2024-02-25 09:57:42 +05:30
Kovid Goyal
b5edd9ad57
Don't precalculate mask in loop body
No need since we don't shift. Avoids the extra mask instructions for the
not found case.
2024-02-25 09:57:42 +05:30
Kovid Goyal
f9fd6ffd46
Use only aligned loads for index funcs
Also obviates the necessity for safe slice wrappers
2024-02-25 09:57:41 +05:30
Kovid Goyal
31a5fcf297
DRYer 2024-02-25 09:57:41 +05:30
Kovid Goyal
561712090d
Fix cmplt implementation 2024-02-25 09:57:41 +05:30
Kovid Goyal
d9190ea675
DRYer 2024-02-25 09:57:41 +05:30
Kovid Goyal
57f4ea4d4a
Add some tests for broadcast from constant intrinsic 2024-02-25 09:57:41 +05:30
Kovid Goyal
9b0ae8d403
Don't use VEX encoded instructions for 128 bit ISA 2024-02-25 09:57:41 +05:30
Kovid Goyal
aed0611fb8
Avoid double trailing RET 2024-02-25 09:57:40 +05:30
Kovid Goyal
5a5e31c38b
Also zero upper at start of function 2024-02-25 09:57:40 +05:30
Kovid Goyal
db2e0e816d
Fix mixing of register types in the same function 2024-02-25 09:57:40 +05:30
Kovid Goyal
a298781b85
DRYer 2024-02-25 09:57:40 +05:30
Kovid Goyal
d5cd9ef2ca
... 2024-02-25 09:57:40 +05:30
Kovid Goyal
da31db3212
... 2024-02-25 09:57:40 +05:30
Kovid Goyal
601c4ad4df
Fix some typos 2024-02-25 09:57:40 +05:30
Kovid Goyal
68d800d4fa
make clean should clean generated asm as well 2024-02-25 09:57:40 +05:30
Kovid Goyal
9fc3db1dd1
Work on C0 index func 2024-02-25 09:57:40 +05:30
Kovid Goyal
161eae78b6
Make generated asm_* files world readable 2024-02-25 09:57:40 +05:30
Kovid Goyal
77cfd44f24
More efficient clearing of register to all zeros or all ones 2024-02-25 09:57:39 +05:30
Kovid Goyal
59be7213cf
Make set1_epi8 more general 2024-02-25 09:57:39 +05:30
Kovid Goyal
d60dacbd09
Implement > and < intrinsics for vector registers 2024-02-25 09:57:39 +05:30
Kovid Goyal
82b7b4fcce
Make a reusable template for generating ASM index functions with different tests 2024-02-25 09:57:39 +05:30
Kovid Goyal
4e6138d785
Generate SIMD code during build 2024-02-25 09:57:39 +05:30
Kovid Goyal
de8c1e0206
Work on porting SIMD VT parser to Go for the kittens 2024-02-25 09:57:39 +05:30