Magnus Ulimoen
8873f458b4
useless conversion
2021-06-29 18:07:26 +02:00
Magnus Ulimoen
9e2ce3ae24
Add Metrics iterator
2021-03-26 16:27:45 +01:00
Magnus Ulimoen
a33e1d37ba
Document what compiler is doing for diffxi
2021-03-25 22:39:00 +01:00
Magnus Ulimoen
76f5291131
Elide bounds check in diffxi
array_windows.skip did not elide bounds checks as it should. If
the slice is instead offset by the skipped amount, we get the
same behaviour, and this helps the compiler enough.
The two changed lines allow SIMD optimisations, giving an
impressive two-thirds reduction in instruction count in the
benchmark.
2021-03-25 17:23:01 +01:00
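The change described in the commit above can be illustrated with a minimal sketch. It is not the actual diffxi code: the function names, the 3-point kernel and the use of the stable `windows` method (in place of `array_windows`) are all assumptions made for illustration. It only shows that skipping windows and offsetting the slice yield the same values; whether bounds checks are elided depends on the compiler version and should be confirmed in the generated assembly or a benchmark.

```rust
/// Hypothetical 3-point stencil that drops the first `offset` windows
/// with `Iterator::skip` (the shape the commit moved away from).
fn stencil_skip(input: &[f32], offset: usize, kernel: [f32; 3]) -> Vec<f32> {
    input
        .windows(3)
        .skip(offset)
        .map(|w| kernel[0] * w[0] + kernel[1] * w[1] + kernel[2] * w[2])
        .collect()
}

/// Same stencil, but the slice itself is offset before windowing
/// (the shape the commit moved to).
/// Note: unlike `skip`, `input[offset..]` panics if `offset > input.len()`.
fn stencil_offset(input: &[f32], offset: usize, kernel: [f32; 3]) -> Vec<f32> {
    input[offset..]
        .windows(3)
        .map(|w| kernel[0] * w[0] + kernel[1] * w[1] + kernel[2] * w[2])
        .collect()
}
```

For any `offset` no larger than the slice length, both functions return the same vector, so the second form can replace the first without changing results.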
Magnus Ulimoen
4ae5c02bb1
Replace FastFloat with mul_add
2021-03-23 19:21:38 +01:00
Magnus Ulimoen
7aadda3de9
Move integrate to separate crate
2021-03-22 17:49:35 +01:00
Magnus Ulimoen
be1330ec02
Add constrmatrix as separate crate
2021-03-22 16:24:32 +01:00
Magnus Ulimoen
502679c9a1
Move Float to separate crate
2021-03-22 16:17:27 +01:00
Magnus Ulimoen
be984fbdac
Bump sprs to 0.10
2021-03-18 23:27:03 +01:00
Magnus Ulimoen
550b43b4cd
Bump ndarray
2021-03-16 19:03:35 +01:00
Magnus Ulimoen
f098981d3e
Update email
2021-03-16 19:00:24 +01:00
Magnus Ulimoen
8383517ba3
ensure slice can be cast to Matrix
2021-03-15 20:18:19 +01:00
Magnus Ulimoen
17ab18e953
zero-pad diffxi kernel
2021-03-15 20:07:41 +01:00
Magnus Ulimoen
e43e71a4d8
Make flip_XX impl on Matrix
2021-03-15 19:31:41 +01:00
Magnus Ulimoen
6fc045ae17
Replace transmute with cast
2021-02-12 19:02:13 +01:00
Magnus Ulimoen
a02c7daafc
remove iterator inhibiting optimisation
2021-02-10 21:17:05 +01:00
Magnus Ulimoen
02175d1734
use some unsafe...
2021-02-10 19:29:26 +01:00
Magnus Ulimoen
8a6dc60edf
remove some unsafe from simd
2021-02-10 19:02:48 +01:00
Magnus Ulimoen
87c055f81e
Add back simd column algo
2021-02-09 21:44:35 +01:00
Magnus Ulimoen
cf4d8f1e9b
add Zero for constmatrix
2021-02-03 08:41:11 +01:00
Magnus Ulimoen
7ec426b5a8
add more fast intrinsics
2021-02-03 08:31:33 +01:00
Magnus Ulimoen
64a4e92dd2
specialise on contiguous ny
2021-02-02 00:33:37 +01:00
Magnus Ulimoen
c709cf465e
remove ndarray transmute
2021-02-02 00:12:03 +01:00
Magnus Ulimoen
74d99a4a18
try ndarray transmute
2021-02-02 00:12:03 +01:00
Magnus Ulimoen
b15ea57e6d
inline for perf
2021-02-02 00:12:03 +01:00
Magnus Ulimoen
299b4f8083
ensure FastFloat flag works
2021-02-02 00:12:03 +01:00
Magnus Ulimoen
6f7268bf33
use matrices everywhere
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
31ac46e386
move data structs into separate files
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
f7f8a7ffff
make flip const functions
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
1f3aa2c116
make core-intrinsics cfg'ed
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
c73c6e7407
diff_op_col_naive_matrix
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
c660354c3f
Matrix for Upwind4
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
45e4d51513
remove a closure
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
b0e1ec62f8
change order in matmul_into
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
481f2d607e
remove a lot of unsafe, lost perf
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
00f3ba6a01
use split_at_mut
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
db94caf2b2
change repr of Matrix
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
c133557459
use Matrix in SBP diff
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
3c7cc4605a
simplify traits
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
f7c238f6a7
add blockend SBP8
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
a7660281c8
add inline to remove magic fix
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
14fefe97ab
add Matrix approach for SBP8
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
3d34f2e7a0
add fast-float feature
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
4c2daf5933
minor thingys
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
db552af4ff
add blockend with weird caveat
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
bcda26a512
15% reductions in instr count
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
36293e75e6
10% instr reduction with fast_ intr
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
30c563c19d
working d1 SBP4
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
94e8fb5b7c
checkpoint
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
c104082ac0
add matrix type
2021-02-02 00:12:02 +01:00