Magnus Ulimoen
4c5c0305e4
Bump ndarray approx
2022-07-19 20:33:50 +02:00
Magnus Ulimoen
5acd46af6d
Upgrade dependencies
2022-07-05 21:59:27 +02:00
Magnus Ulimoen
2a1bb3f815
Use serde1 feature
2022-07-05 20:19:49 +02:00
Magnus Ulimoen
bb1909c2a8
Use StdFloat to fix compile error
2022-07-05 19:51:48 +02:00
Magnus Ulimoen
f40de866ce
Clippy lints
2022-05-17 08:24:33 +02:00
Magnus Ulimoen
70cab01334
Use weak dependencies
2022-05-17 08:03:05 +02:00
Magnus Ulimoen
cfeb30fac0
Small clippy lint fixes
2022-02-25 20:43:57 +01:00
Magnus Ulimoen
9679ae5ba2
Remove superfluous import
2021-11-21 11:21:05 +01:00
Magnus Ulimoen
d16b274fe0
Update hdf5/ndarray
2021-10-23 19:35:37 +00:00
Magnus Ulimoen
0ec3e16566
Align portable-simd with master
2021-09-30 05:15:31 +00:00
Magnus Ulimoen
d0901f5755
SAT boundaries for multi-thread fixing
2021-08-21 09:29:45 +00:00
Magnus Ulimoen
d2c811d3af
Rework wait primitive to condvar
2021-08-20 15:57:12 +00:00
Magnus Ulimoen
2d473b8255
Clippy lints
2021-08-16 20:33:57 +00:00
Magnus Ulimoen
3edd18c4fd
Use core_simd over packed_simd
2021-07-27 18:32:22 +00:00
Magnus Ulimoen
8873f458b4
useless conversion
2021-06-29 18:07:26 +02:00
Magnus Ulimoen
9e2ce3ae24
Add Metrics iterator
2021-03-26 16:27:45 +01:00
Magnus Ulimoen
a33e1d37ba
Document what compiler is doing for diffxi
2021-03-25 22:39:00 +01:00
Magnus Ulimoen
76f5291131
Elide bounds check in diffxi
array_windows.skip did not elide bounds checks as it should. If
the slice is instead offset by the skipped amount, the behaviour
is the same, but the compiler gets enough help to elide the checks.
The two changed lines allow SIMD optimisations, cutting the
instruction count in the benchmark by two thirds.
2021-03-25 17:23:01 +01:00
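The change described in commit 76f5291131 above can be illustrated with a minimal sketch. This is not the crate's actual diffxi kernel: the function names, the 3-point stencil, and the use of stable `windows(3)` in place of the nightly `array_windows` are all assumptions made for illustration. Both variants compute the same values; according to the commit message, the second form (offsetting the slice before iterating) is the one the compiler optimised better in the author's benchmark.

```rust
/// Hypothetical 3-point stencil, variant 1: skip on the iterators.
/// Writes 0.5 * (prev[i + 2] - prev[i]) into fut[i] for i >= skip.
fn central_diff_skip(prev: &[f32], fut: &mut [f32], skip: usize) {
    for (w, f) in prev.windows(3).skip(skip).zip(fut.iter_mut().skip(skip)) {
        *f = 0.5 * (w[2] - w[0]);
    }
}

/// Hypothetical 3-point stencil, variant 2: offset the slices first,
/// then iterate without `skip`. Same result as variant 1 whenever
/// `skip` is within bounds (slicing panics if `skip` exceeds the length).
fn central_diff_offset(prev: &[f32], fut: &mut [f32], skip: usize) {
    let prev = &prev[skip..];
    let fut = &mut fut[skip..];
    for (w, f) in prev.windows(3).zip(fut.iter_mut()) {
        *f = 0.5 * (w[2] - w[0]);
    }
}
```

Whether the bounds checks are actually elided depends on the compiler version and optimisation level, so inspecting the generated assembly or counting instructions in a benchmark, as the commit did, is the way to confirm it.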
Magnus Ulimoen
4ae5c02bb1
Replace FastFloat with mul_add
2021-03-23 19:21:38 +01:00
Magnus Ulimoen
7aadda3de9
Move integrate to separate crate
2021-03-22 17:49:35 +01:00
Magnus Ulimoen
be1330ec02
Add constrmatrix as separate crate
2021-03-22 16:24:32 +01:00
Magnus Ulimoen
502679c9a1
Move Float to separate crate
2021-03-22 16:17:27 +01:00
Magnus Ulimoen
be984fbdac
Bump sprs to 0.10
2021-03-18 23:27:03 +01:00
Magnus Ulimoen
550b43b4cd
Bump ndarray
2021-03-16 19:03:35 +01:00
Magnus Ulimoen
f098981d3e
Update email
2021-03-16 19:00:24 +01:00
Magnus Ulimoen
8383517ba3
ensure slice can be cast to Matrix
2021-03-15 20:18:19 +01:00
Magnus Ulimoen
17ab18e953
zero-pad diffxi kernel
2021-03-15 20:07:41 +01:00
Magnus Ulimoen
e43e71a4d8
Make flip_XX impl on Matrix
2021-03-15 19:31:41 +01:00
Magnus Ulimoen
6fc045ae17
Replace transmute with cast
2021-02-12 19:02:13 +01:00
Magnus Ulimoen
a02c7daafc
remove iterator inhibiting optimisation
2021-02-10 21:17:05 +01:00
Magnus Ulimoen
02175d1734
use some unsafe...
2021-02-10 19:29:26 +01:00
Magnus Ulimoen
8a6dc60edf
remove some unsafe from simd
2021-02-10 19:02:48 +01:00
Magnus Ulimoen
87c055f81e
Add back simd column algo
2021-02-09 21:44:35 +01:00
Magnus Ulimoen
cf4d8f1e9b
add Zero for constmatrix
2021-02-03 08:41:11 +01:00
Magnus Ulimoen
7ec426b5a8
add more fast intrinsics
2021-02-03 08:31:33 +01:00
Magnus Ulimoen
64a4e92dd2
specialise on contiguous ny
2021-02-02 00:33:37 +01:00
Magnus Ulimoen
c709cf465e
remove ndarray transmute
2021-02-02 00:12:03 +01:00
Magnus Ulimoen
74d99a4a18
try ndarray transmute
2021-02-02 00:12:03 +01:00
Magnus Ulimoen
b15ea57e6d
inline for perf
2021-02-02 00:12:03 +01:00
Magnus Ulimoen
299b4f8083
ensure FastFloat flag works
2021-02-02 00:12:03 +01:00
Magnus Ulimoen
6f7268bf33
use matrices everywhere
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
31ac46e386
move data structs into separate files
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
f7f8a7ffff
make flip const functions
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
1f3aa2c116
make core-intrinsics cfg'ed
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
c73c6e7407
diff_op_col_naive_matrix
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
c660354c3f
Matrix for Upwind4
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
45e4d51513
remove a closure
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
b0e1ec62f8
change order in matmul_into
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
481f2d607e
remove a lot of unsafe, lost perf
2021-02-02 00:12:02 +01:00
Magnus Ulimoen
00f3ba6a01
use split_at_mut
2021-02-02 00:12:02 +01:00