← 返回首页
Ragged reduction by syurkevi · Pull Request #2786 · arrayfire/arrayfire · GitHub
Skip to content

Navigation Menu

Toggle navigation
Sign in
Appearance settings
Search or jump to...

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Saved searches

Use saved searches to filter your results more quickly

Appearance settings
Resetting focus

Ragged reduction#2786

Merged
9prady9 merged 9 commits into
arrayfire:masterfrom
syurkevi:ragged_max
Apr 7, 2020
Merged

Ragged reduction#2786
9prady9 merged 9 commits into
arrayfire:masterfrom
syurkevi:ragged_max

Conversation

Copy link
Copy Markdown
Contributor

syurkevi commented Mar 9, 2020

Addresses #2782 .

TODO:

  • tests
  • other ragged functions(min, sum, product, count, any, all)

Comment thread include/af/algorithm.h Outdated Show resolved Hide resolved
Comment thread include/af/algorithm.h Outdated Show resolved Hide resolved
Comment thread include/af/algorithm.h Outdated

\note NaN values are ignored
*/
AFAPI void max(array &val, array &idx, const array &in, const int dim, const array &ragged_len);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason Spam Abuse Off Topic Outdated Duplicate Resolved Low Quality Hide comment

Opinion: I think ragged_len should come in after the in array.

Comment thread src/api/c/reduce.cpp
}

template<af_op_t op>
static af_err rreduce_common(af_array *val, af_array *idx, const af_array in,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason Spam Abuse Off Topic Outdated Duplicate Resolved Low Quality Hide comment

Will an overload to reduce_common not work?

Comment thread src/api/c/reduce.cpp Outdated Show resolved Hide resolved
Comment thread src/api/c/reduce.cpp Outdated Show resolved Hide resolved
Comment thread include/af/algorithm.h Show resolved Hide resolved
Comment thread src/api/c/reduce.cpp Outdated Show resolved Hide resolved
Comment thread src/backend/cpu/ireduce.cpp Outdated

template<af_op_t op, typename T>
void rreduce(Array<T> &out, Array<uint> &loc, const Array<T> &in,
const int dim, const Array<uint> &rlen) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason Spam Abuse Off Topic Outdated Duplicate Resolved Low Quality Hide comment

Is this function still needed? It looks like you can combine this with ireduce.

Copy link
Copy Markdown
Contributor

WilliamTambellini commented Mar 10, 2020
edited
Loading

@syurkevi sounds good:
ArrayFire v3.7.0 (CUDA, 64-bit Linux, build 4206fa3)
Platform: CUDA Runtime 10.2, Driver: 440.33.01
[0] Quadro RTX 3000, 5935 MB, CUDA Compute 7.5
Benchmark max at f32
in max ragmax
4 16384 0.038 0.037
8 16384 0.034 0.036
16 16384 0.035 0.037
32 16384 0.037 0.040
64 16384 0.074 0.081
128 16384 0.148 0.159
256 16384 0.293 0.309

Though f16 not really faster than f32:
Device 0 isHalfAvailable ? yes
ArrayFire v3.7.0 (CUDA, 64-bit Linux, build 4206fa3)
Platform: CUDA Runtime 10.2, Driver: 440.33.01
[0] Quadro RTX 3000, 5935 MB, CUDA Compute 7.5
Benchmark max at f16
in max ragmax
4 16384 0.040 0.037
8 16384 0.035 0.037
16 16384 0.036 0.039
32 16384 0.038 0.040
64 16384 0.067 0.070
128 16384 0.143 0.152
256 16384 0.282 0.298

Could you please just add a minimalist bench like this one:
https://gist.github.com/WilliamTambellini/0f9309ab27b7076369f24c3217af4ffd
?

Copy link
Copy Markdown
Contributor

@syurkevi could you please at least rebase for me to see what needs to be finished and advise accordingly ?

Copy link
Copy Markdown
Member

9prady9 commented Mar 16, 2020

@syurkevi I took care of rebase from latest master. If you are adding more ragged functions and need to touch the ireduce kernel. You can find the kernels in the file src/backend/cuda/kernel/ireduce.cuh and the kernel wrappers inside src/backend/cuda/kernel/ireduce.hpp. The backend source file is src/backend/cuda/ireduce.cpp and not ireduce.cu.

If you face any issues while editing/adding new things to kernels, please ping me. I can guide you through the nvrtc related changes.

Copy link
Copy Markdown
Contributor

Thank you @9prady9
@syurkevi I am only interested by the max reduction, as agreed with @umar456 when ordering the support. Could you please limit that PR to the max/raggedmax reduction ?
Kind regards

syurkevi force-pushed the ragged_max branch 2 times, most recently from e502b36 to 6599dd2 Compare March 21, 2020 08:10
umar456 added this to the 3.8.0 milestone Mar 24, 2020
umar456 changed the title Ragged reduction [WIP] Ragged reduction Mar 27, 2020
syurkevi force-pushed the ragged_max branch 2 times, most recently from aa0a856 to 2307ccf Compare March 31, 2020 21:57
9prady9 merged commit 0a66851 into arrayfire:master Apr 7, 2020
umar456 mentioned this pull request Apr 8, 2020
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Footer

© 2026 GitHub, Inc.