Normally you should not call these Ops directly! Theano should automatically transform CPU ops into their GPU equivalents, so this list is mainly useful to let people know what is implemented on the GPU.
Implement AdvancedIncSubtensor1 on the gpu.
Implement AdvancedSubtensor1 on the gpu.
Implement Alloc on the gpu.
The memset_0 param is an optimization. When True, we call cudaMemset, which is faster.
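As a hedged illustration (assuming a standard Theano install), the sketch below builds an Alloc filled with constant 0, which is the case the memset_0 optimization targets; with device=gpu the optimizer is expected to move this Alloc to its GPU equivalent.

```python
# A minimal sketch: T.zeros builds an Alloc of constant 0.
import theano
import theano.tensor as T

n = T.iscalar('n')
z = T.zeros((n, n), dtype=theano.config.floatX)  # Alloc of constant 0
f = theano.function([n], z)
print(f(3).shape)  # (3, 3)
```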
GpuCAReduce is a reduction along some dimensions by a scalar op.
The dimensions along which to reduce are specified by the reduce_mask that you pass to the constructor. The reduce_mask is a tuple of booleans (actually integers 0 or 1) that specifies, for each input dimension, whether to reduce it (1) or not (0).
For example, when scalar_op is a theano.scalar.basic.Add instance:
- reduce_mask == (1,) sums a vector to a scalar
- reduce_mask == (1,0) computes the sum of each column in a matrix
- reduce_mask == (0,1) computes the sum of each row in a matrix
- reduce_mask == (1,1,1) computes the sum of all elements in a 3-tensor.
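For illustration, here is a hedged sketch (assuming a standard Theano install) of graphs whose sums correspond to the reduce_mask values above; you would not build GpuCAReduce yourself, the GPU optimizer rewrites the resulting CAReduce nodes when the graph is moved to the GPU.

```python
# A minimal sketch of reductions that map to the reduce_mask patterns above.
import theano
import theano.tensor as T

x = T.matrix('x')
total = x.sum()           # reduce_mask == (1, 1): sum of all elements
col_sums = x.sum(axis=0)  # reduce_mask == (1, 0): sum of each column
row_sums = x.sum(axis=1)  # reduce_mask == (0, 1): sum of each row
f = theano.function([x], [total, col_sums, row_sums])
```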
Note: any reduce_mask of all zeros is a sort of 'copy', and may be removed during graph optimization.
This Op is a work in progress.
This op was recently upgraded from just GpuSum to a general CAReduce. Not many code cases are supported for scalar_op being anything other than scal.Add instances yet.
Important note: if you implement new cases for this op, be sure to benchmark them and make sure that they actually result in a speedup. GPUs are not especially well-suited to reduction operations so it is quite possible that the GPU might be slower for some cases.
Parameters: N – the number of 1s in the pattern: N=1 -> 01, N=2 -> 011, N=3 -> 0111. Works for N=1, 2, 3.
Reduce.
WRITEME IG: I believe, based on how this is called in c_code, that it is for the case where we are reducing on all axes and x is C contiguous.
Returns True if the current op and reduce pattern have functioning C code.
Always return a c contiguous output. Copy the input only if it is not already c contiguous.
Implement DimShuffle on the gpu.
Implement a generic elemwise on the gpu.
Implement Flatten on the gpu.
Implement the transfer from cpu to the gpu.
Implement IncSubtensor on the gpu.
view: string, C code expression for an array
source: string, C code expression for an array
Returns a C code expression to copy source into view; the expression returns 0 on success.
Parameters: x – a string giving the name of a C variable pointing to an array
Returns: C code expression to make a copy of x
Base class uses PyArrayObject *, subclasses may override for different types of arrays.
Should raise NotImplementedError if c_code does not support the types involved in this node.
Return a dictionary of arguments to use with helper_c_code
This doesn’t need to actually set up the view with the right indexing; we’ll do that manually later.
Implement Join on the gpu.
Implement Reshape on the gpu.
Implement Shape on the gpu.
Implement subtensor on the gpu.
Implement the transfer from gpu to the cpu.
Return a symbolic column variable (ndim=2, broadcastable=[False,True]).
:param dtype: numeric type (None means to use theano.config.floatX)
:param name: a name to attach to this variable

Return a symbolic matrix variable.
:param dtype: numeric type (None means to use theano.config.floatX)
:param name: a name to attach to this variable

Return a symbolic row variable (ndim=2, broadcastable=[True,False]).
:param dtype: numeric type (None means to use theano.config.floatX)
:param name: a name to attach to this variable

Return a symbolic scalar variable.
:param dtype: numeric type (None means to use theano.config.floatX)
:param name: a name to attach to this variable

Return a symbolic 3-D variable.
:param dtype: numeric type (None means to use theano.config.floatX)
:param name: a name to attach to this variable

Return a symbolic 4-D variable.
:param dtype: numeric type (None means to use theano.config.floatX)
:param name: a name to attach to this variable

Return a symbolic vector variable.
:param dtype: numeric type (None means to use theano.config.floatX)
:param name: a name to attach to this variable
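A hedged usage sketch of the shared (dtype, name) calling convention, shown with the standard tensor constructors; the GPU variable constructors documented above follow the same convention.

```python
# A minimal sketch, assuming a standard Theano install.
import theano
import theano.tensor as T

m = T.matrix(name='m', dtype=theano.config.floatX)  # 2-D variable
v = T.vector(name='v')     # dtype defaults to theano.config.floatX
r = T.row(name='r')        # ndim=2, broadcastable=[True, False]
c = T.col(name='c')        # ndim=2, broadcastable=[False, True]
t3 = T.tensor3(name='t3')  # 3-D variable
```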
Implement the batched and stacked 2d convolution on the gpu.
Implement dot(2d, 2d) on the gpu.
Implement dot(2d, 2d) * scalar on the gpu.
Implement downsample with max on the gpu.
Implement the grad of downsample with max on the gpu.
Implement gemm on the gpu.
Implement gemv on the gpu.
Implement ger on the gpu.
Implement CrossentropySoftmax1HotWithBiasDx on the gpu.
Gradient wrt x of the CrossentropySoftmax1Hot Op
Implement CrossentropySoftmaxArgmax1HotWithBias on the gpu.
Implement Softmax on the gpu.
Implement SoftmaxWithBias on the gpu.
Random generator based on the CURAND library. It is not inserted automatically.
Define CURAND_RandomStreams - backed by CURAND
Base class for a random number generator implemented in CURAND.
The random number generator itself is an opaque reference managed by CURAND. This Op uses a generic-typed shared variable to point to a CObject that encapsulates this opaque reference.
Each random variable is created with its generator set to False. The actual random number generator is allocated from the seed on the first call to allocate random numbers (see c_code).
Note: one caveat is that the random number state is simply not serializable. Consequently, attempts to serialize functions compiled with these random numbers will fail.
Return a destructive version of self.
Return a symbolic sample from generator.
cls dictates the random variable (e.g. uniform, normal)
Op to draw normal numbers using CURAND
RandomStreams instance that creates CURAND-based random variables.
One caveat is that generators are not serializable.
Return a unique seed for initializing a random variable.
Return symbolic tensor of normally-distributed numbers.
Param: size – can be a list of integers or a Theano variable (e.g. the shape of another Theano variable).
Return symbolic tensor of uniform numbers.
List of all (old, new) generator update pairs created by this instance.
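A hedged usage sketch of CURAND_RandomStreams, assuming a CUDA-enabled Theano build and the theano.sandbox.rng_curand module path (as noted above, it is not inserted automatically).

```python
# A hedged sketch, assuming a CUDA-enabled Theano build.
import theano
from theano.sandbox.rng_curand import CURAND_RandomStreams

rng = CURAND_RandomStreams(seed=1234)
u = rng.uniform(size=(10, 2))   # uniform numbers; size may be symbolic
n = rng.normal(size=(10, 2))    # normally-distributed numbers
f = theano.function([], [u, n])
# rng.updates() lists the (old, new) generator update pairs created above.
```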
Op to draw uniform numbers using CURAND