c++ - cuda, pycuda -- how to write complex numbers -- errors: class "cuComplex" has no member "i"
I am having difficulty using complex numbers in CUDA and PyCUDA.
I have this in C++:
    #include <complex>
    typedef std::complex<double> cmplx;
    ....
    cmplx j(0.,1.);
Also, in the same code:
    #include <boost/python.hpp>
    #include <boost/array.hpp>
    ...
    typedef std::vector< boost::array<std::complex<double>,3 > > complexfieldtype;
    typedef std::vector< boost::array<double,3> > realfieldtype;
    ...
    __global__ void compute(realfieldtype const & rs,complexfieldtype const & m,..)
    ...
How can I convert this for use with PyCUDA? I tried something like this (following the book 'CUDA by Example'):
    struct cuComplex {
        float real;
        float imag;
        cuComplex(float a,float b): real(a),imag(b){}
        cuComplex operator *(const cuComplex& a) { return cuComplex(real*a.real -imag*a.imag ,imag*a.real +real*a.imag); }
        cuComplex operator +(const cuComplex& a) { return cuComplex(real+a.real ,imag+a.imag); };
    cuComplex j(0.,1.); //instead of cmplx j(0.,1.);
    __global__ void compute(float *rs,cuComplex * m,..) //instead of realfieldtype const & rs,complexfieldtype const & m
    ....
Some of the errors I get are:
data member initializer is not allowed
this declaration has no storage class or type specifier
Thank you!
---------------------edit----------------------------------------------
I did the following, using #include <pycuda-complex.hpp>
(relative to the above):
    pycuda::complex<float> cmplx;
    cmplx j(0.,1.);
As for
    typedef std::vector< boost::array<std::complex<double>,3 > > complexfieldtype;
and
    complexfieldtype const & m
inside the global function, I tried "float *m" or "cmplx *m".
Until now, I am getting the error:
variable "cmplx" is not a type name
If I use pycuda::complex cmplx; then I get:
identifier "cmplx" is undefined
name followed by "::" must be a class or namespace name
Also:
expression must have pointer-to-object type (but maybe this comes from another part of the code)
It isn't clear what you are trying to do (if you know yourself), and the question is getting progressively more confused as the edits and comments roll on. To expand on Andreas's reply a little, here is a simple, compilable piece of CUDA code which uses the PyCUDA native complex type correctly:
    #include <pycuda-complex.hpp>

    // One thread per element: z[tid] = x[tid] + y[tid]
    template<typename T>
    __global__ void kernel(const T * x, const T *y, T *z)
    {
        int tid = threadIdx.x + blockDim.x * blockIdx.x;
        z[tid] = x[tid] + y[tid];
    }

    // typedef makes scmplx and dcmplx type names
    typedef pycuda::complex<float> scmplx;
    typedef pycuda::complex<double> dcmplx;

    // Explicit instantiations for the real and complex types
    template void kernel<float>(const float *, const float *, float *);
    template void kernel<double>(const double *, const double *, double *);
    template void kernel<scmplx>(const scmplx *, const scmplx *, scmplx *);
    template void kernel<dcmplx>(const dcmplx *, const dcmplx *, dcmplx *);
This gives single and double precision, real and complex versions of the trivial kernel, and compiles with nvcc like this:
    $ nvcc -arch=sm_20 -Xptxas="-v" -I$HOME/pycuda-2011.1.2/src/cuda -c scmplx.cu
    ptxas info    : Compiling entry function '_Z6kernelIN6pycuda7complexIdEEEvPKT_S5_PS3_' for 'sm_20'
    ptxas info    : Used 12 registers, 44 bytes cmem[0], 168 bytes cmem[2], 4 bytes cmem[16]
    ptxas info    : Compiling entry function '_Z6kernelIN6pycuda7complexIfEEEvPKT_S5_PS3_' for 'sm_20'
    ptxas info    : Used 8 registers, 44 bytes cmem[0], 168 bytes cmem[2]
    ptxas info    : Compiling entry function '_Z6kernelIdEvPKT_S2_PS0_' for 'sm_20'
    ptxas info    : Used 8 registers, 44 bytes cmem[0], 168 bytes cmem[2]
    ptxas info    : Compiling entry function '_Z6kernelIfEvPKT_S2_PS0_' for 'sm_20'
    ptxas info    : Used 4 registers, 44 bytes cmem[0], 168 bytes cmem[2]
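If you want to convince yourself that one template really does cover all four element types without needing a GPU, here is a rough host-only analogue of the kernel above. It is only a sketch: std::complex stands in for pycuda::complex (an assumption, they are different classes), add_elementwise is a made-up name, and the loop index plays the role of tid.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Host-side stand-in for the templated kernel: each loop iteration does the
// work that one CUDA thread (index tid) would do on the device.
template <typename T>
std::vector<T> add_elementwise(const std::vector<T>& x, const std::vector<T>& y)
{
    assert(x.size() == y.size());  // the inputs must have the same length
    std::vector<T> z(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        z[i] = x[i] + y[i];
    }
    return z;
}

// Assumption: std::complex is used here in place of pycuda::complex.
typedef std::complex<float> scmplx;
typedef std::complex<double> dcmplx;

// Explicit instantiations, mirroring the four in the .cu file.
template std::vector<float> add_elementwise<float>(
    const std::vector<float>&, const std::vector<float>&);
template std::vector<double> add_elementwise<double>(
    const std::vector<double>&, const std::vector<double>&);
template std::vector<scmplx> add_elementwise<scmplx>(
    const std::vector<scmplx>&, const std::vector<scmplx>&);
template std::vector<dcmplx> add_elementwise<dcmplx>(
    const std::vector<dcmplx>&, const std::vector<dcmplx>&);
```

The point is that the body only needs operator+ on T, so the real and complex versions share exactly the same source.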
Perhaps this goes some way to answering your question....
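As for the compiler errors themselves, this is a guess from the fragments posted, not a diagnosis of your real code. In the hand-rolled struct as posted there is no closing brace before "j" is declared, so the compiler would be reading "j" as a data member with an initializer. In the edit, "pycuda::complex<float> cmplx;" declares a variable called cmplx, not a type, which would explain 'variable "cmplx" is not a type name'; a typedef is what makes it a type. A minimal sketch with both points fixed follows. The HOST_DEVICE macro and the imaginary_unit helper are my own illustrative names, not anything from PyCUDA or the book.

```cpp
// Under nvcc the member functions are marked callable from host and device;
// with a plain C++ compiler the macro expands to nothing.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

struct cuComplex {
    float real;
    float imag;

    HOST_DEVICE cuComplex(float a, float b) : real(a), imag(b) {}

    // (a + bi)(c + di) = (ac - bd) + (bc + ad)i
    HOST_DEVICE cuComplex operator*(const cuComplex& a) const {
        return cuComplex(real * a.real - imag * a.imag,
                         imag * a.real + real * a.imag);
    }

    HOST_DEVICE cuComplex operator+(const cuComplex& a) const {
        return cuComplex(real + a.real, imag + a.imag);
    }
};  // the struct is closed here, before any object of the type is declared

// typedef introduces a type name; without it, "cmplx" would be a variable.
typedef cuComplex cmplx;

// Hypothetical helper: builds j where it is needed, because an ordinary
// host-side global object is not accessible from inside a kernel.
HOST_DEVICE inline cmplx imaginary_unit() { return cmplx(0.0f, 1.0f); }
```

With that in place, imaginary_unit() * imaginary_unit() gives real part -1 and imaginary part 0, as expected for j*j. That said, using pycuda::complex as in the kernel above saves you from maintaining your own class.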
c++ cuda complex-numbers pycuda