Depending on whether an af::array is const or not, the same indexing operation produces an array with different dimensions.
Description
- I encountered this bug while refactoring some of my code. The code was a single big function, which I started breaking down into several smaller functions. I carefully marked const the function parameters that should be constant, and was surprised when this simple refactoring (which shouldn't have changed the output at all) all of a sudden made my program crash.
- Did you build ArrayFire yourself or did you use the official installers: Official installers
- Which backend is experiencing this issue? All
- Do you have a workaround? Yes (don't use const, which isn't great)
- Can the bug be reproduced reliably on your system? Yes
- A clear and concise description of what you expected to happen: I expect the operations available on a non-const af::array to form a strict superset of the operations available on a const af::array. In other words, if I can perform operation B on a const array, I should be able to perform the same operation B on a non-const array and get exactly the same output.
- Run your executable with AF_TRACE=all and AF_PRINT_ERRORS=1 environment variables set, screenshot or terminal output of the results:
[platform][1609772142][007696] [ ..\src\backend\common\DependencyModule.cpp(99) ] Attempting to load: forge.dll
[platform][1609772142][007696] [ ..\src\backend\common\DependencyModule.cpp(102) ] Found: forge.dll
[platform][1609772142][007696] [ ..\src\backend\cuda\device_manager.cpp(428) ] CUDA Driver supports up to CUDA 10.1 ArrayFire CUDA Runtime 10.1
[platform][1609772142][007696] [ ..\src\backend\cuda\device_manager.cpp(495) ] Found 1 CUDA devices
[platform][1609772142][007696] [ ..\src\backend\cuda\device_manager.cpp(521) ] Found device: GeForce GTX 1070 (8 GB | ~6256.71 GFLOPs | 15 SMs)
[platform][1609772142][007696] [ ..\src\backend\cuda\device_manager.cpp(556) ] AF_CUDA_DEFAULT_DEVICE:
[platform][1609772142][007696] [ ..\src\backend\cuda\device_manager.cpp(575) ] Default device: 0(GeForce GTX 1070)
[mem][1609772143][007696] [ ..\src\backend\cuda\memory.cpp(158) ] nativeAlloc: 1 KB 0xb04e00000
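As background for the expectation stated in the description, here is a minimal sketch in plain C++ of how the same call expression can reach two different member functions depending on constness. The `Grid` class and the `which_overload` helpers are hypothetical and have nothing to do with ArrayFire's internals. The sketch does not explain the cause of this bug; it only shows why a const path and a non-const path can diverge when the two overloads are implemented separately.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy container, NOT ArrayFire: it only demonstrates how overload
// resolution selects a member function based on the object's constness.
class Grid {
public:
    explicit Grid(std::size_t n) : data_(n, 0) {}

    // Selected when the object expression is non-const.
    std::string column(std::size_t) { return "non-const overload"; }

    // Selected when the object expression is const.
    std::string column(std::size_t) const { return "const overload"; }

private:
    std::vector<int> data_;
};

// Hypothetical helpers: the same call g.column(0) is written in both,
// yet each one reaches a different member function.
inline std::string which_overload(Grid& g) { return g.column(0); }
inline std::string which_overload(const Grid& g) { return g.column(0); }
```

If the two overloads do not share one implementation, nothing in the language forces them to return equivalent results, which is why a refactoring that only adds const can change behaviour.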
Reproducible Code and/or Steps
#include <arrayfire.h>
#include <iostream>

int main() {
    try {
        const std::size_t dim = 10;
        // Create a simple matrix
        af::array m(dim, dim);
        // Keep a constant reference to it
        const af::array& cm = m;
        af::seq i(af::seq(dim), true);
        af::array r1 = m(af::span, i);
        af::array r2 = cm(af::span, i);
        std::cout << "dim r1: " << r1.dims(0) << "," << r1.dims(1) << std::endl; // prints "dim r1: 10,1"
        std::cout << "dim r2: " << r2.dims(0) << "," << r2.dims(1) << std::endl; // prints "dim r2: 10,10" !
    } catch (const af::exception& e) {
        std::cout << e.what() << std::endl;
        return 1;
    }
    return 0;
}
Compile the above example with any backend, then notice that the two arrays r1 and r2 have different dimensions, even though they were generated from almost the same expression. FYI, the original code that triggered this problem was a bit more involved; I stripped it down to the bare bones needed to reproduce the problem. The original calculation, which makes more sense, is:
/*const*/ af::array& m = /* input */;
af::array r(m.dims(1));
gfor(af::seq i, static_cast<double>(m.dims(1)))
{
    r(i) = af::sum(m(af::span, i) * m(af::span, i));
}
It is effectively auto r = af::diag(af::matmulTN(m,m)) (the diagonal of transpose(m) * m, since the loop sums the squares of each column), but faster. Happy to learn if there's a better way :)
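To make the identity behind that loop concrete, here is a small sketch in plain C++ with no ArrayFire dependency. The `Matrix` alias and both function names are made up for illustration. It checks that the per-column sum of squares, which is what the gfor loop accumulates, equals the diagonal of transpose(m) * m.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix type for illustration: m[row][col].
using Matrix = std::vector<std::vector<double>>;

// Sum of squares of each column: r[i] = sum over k of m[k][i]^2.
// This is the quantity the gfor loop computes one column at a time.
std::vector<double> column_sum_of_squares(const Matrix& m) {
    const std::size_t rows = m.size();
    const std::size_t cols = rows ? m[0].size() : 0;
    std::vector<double> r(cols, 0.0);
    for (std::size_t k = 0; k < rows; ++k)
        for (std::size_t i = 0; i < cols; ++i)
            r[i] += m[k][i] * m[k][i];
    return r;
}

// Diagonal of transpose(m) * m, computed through the full product.
// It does O(rows * cols^2) work just to keep cols entries.
std::vector<double> diag_of_mt_m(const Matrix& m) {
    const std::size_t rows = m.size();
    const std::size_t cols = rows ? m[0].size() : 0;
    Matrix p(cols, std::vector<double>(cols, 0.0));
    for (std::size_t i = 0; i < cols; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            for (std::size_t k = 0; k < rows; ++k)
                p[i][j] += m[k][i] * m[k][j];
    std::vector<double> d(cols, 0.0);
    for (std::size_t i = 0; i < cols; ++i) d[i] = p[i][i];
    return d;
}
```

In ArrayFire itself, the same result can presumably be obtained without a loop as af::sum(m * m, 0), an element-wise square followed by a reduction along the first dimension. That should yield a 1 x N row rather than a length-N column, and it has not been benchmarked here.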
System Information
Please provide the following information:
- ArrayFire version: 3.7.2
- Devices installed on the system: Nvidia GeForce GTX 1070 (8GB)
- (optional) Output from the af::info() function if applicable.
ArrayFire v3.7.2 (CUDA, 64-bit Windows, build 218dd2c)
Platform: CUDA Runtime 10.1, Driver: 10010
[0] GeForce GTX 1070, 8192 MB, CUDA Compute 6.1
- Output from the following scripts:
Windows: nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 432.00 Driver Version: 432.00 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1070 WDDM | 00000000:65:00.0 Off | N/A |
| 0% 49C P8 14W / 151W | 559MiB / 8192MiB | 2% Default |
+-------------------------------+----------------------+----------------------+
Checklist
- Using the latest available ArrayFire release
- GPU drivers are up to date