//TODO: either figure out why dynamically allocated arrays weren't working, or use a #define to statically allocate
pycuda::complex<double> buf1[81];
pycuda::complex<double> buf2[81];
pycuda::complex<double>* H_cur = buf1;
pycuda::complex<double>* H_next = buf2;
I originally tried using dynamically allocated arrays (the commented-out lines), but had trouble getting the code to work. Switching to statically allocated buffers, with simple pointers to those buffers, fixed that.
This got the code working, but it assumes a specific matrix size.
Either we need to fix the dynamically allocated arrays, or we can use some slight metaprogramming to get around the problem. Since we compile just before use, we know the size of this array at compile time (in fact, we know the size of all of our arrays at compile time). This allows us to use the stack rather than the heap for some of our arrays. Stack-allocated memory can be easier to use, as it is allocated and freed implicitly.
Ways we could go about this: use Python format strings (str.format) to replace certain fields in the kernel source with the numbers we want. This could actually help the transfer costs (probably not noticeably), as many of the values we need could be baked into the instructions themselves rather than passed as parameters.
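A minimal sketch of the format-string idea, assuming the kernel source lives in a Python string. The names KERNEL_TEMPLATE, render_kernel, n_elements and propagate_step are hypothetical and only the four buffer declarations come from the snippet above. Note that str.format requires literal C braces to be doubled.

```python
# Hypothetical kernel template: only the buffer declarations mirror the
# snippet in this issue. Literal C braces are doubled ({{ and }}) so that
# str.format leaves them alone and only fills in {n_elements}.
KERNEL_TEMPLATE = """
__device__ void propagate_step()
{{
    pycuda::complex<double> buf1[{n_elements}];
    pycuda::complex<double> buf2[{n_elements}];
    pycuda::complex<double>* H_cur = buf1;
    pycuda::complex<double>* H_next = buf2;
}}
"""


def render_kernel(vec_size):
    """Return kernel source with the buffer length baked in as a literal."""
    return KERNEL_TEMPLATE.format(n_elements=int(vec_size) * int(vec_size))


# For a 9-element vector the matrix buffers get 81 entries.
source = render_kernel(9)
```

The rendered string would then be handed to whatever compile step the project already uses. The main cost of this approach is that every brace in the kernel has to be escaped, which gets noisy for larger kernels.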
Otherwise, we could use a C preprocessor macro, #define VEC_SIZE 9, and then use VEC_SIZE * VEC_SIZE where the square is needed. This is, in my opinion, slightly more elegant, though it may cost the compiler a (probably trivial) amount of time. This way, a single line can be added to the source code handed to the compiler, and we do not have to worry about the rest.
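The macro variant could look roughly like this on the Python side. As before, KERNEL_BODY, with_vec_size and propagate_step are hypothetical names used only for illustration; the kernel text itself stays untouched and only one line is prepended.

```python
# Hypothetical kernel body: the sizes are written in terms of VEC_SIZE,
# so no brace escaping or field substitution is needed.
KERNEL_BODY = """
__device__ void propagate_step()
{
    pycuda::complex<double> buf1[VEC_SIZE * VEC_SIZE];
    pycuda::complex<double> buf2[VEC_SIZE * VEC_SIZE];
    pycuda::complex<double>* H_cur = buf1;
    pycuda::complex<double>* H_next = buf2;
}
"""


def with_vec_size(kernel_source, vec_size):
    """Prepend a single #define line to the source handed to the compiler."""
    return "#define VEC_SIZE {}\n".format(int(vec_size)) + kernel_source


source = with_vec_size(KERNEL_BODY, 9)
```

Because VEC_SIZE * VEC_SIZE is a constant expression after preprocessing, the arrays remain statically sized, and changing the matrix size only means changing the one prepended line.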
Have you tried single precision complex arrays? Also, did you build your own numpy library from source? Things may have changed greatly since this was last commented.
WrightSim/WrightSim/mixed/propagate.py
Lines 177 to 185 in 8c23c24