Discussion about this post

Andy

This looks really promising! I do wonder, though: would the ideation strategies for writing these kernels need to change depending on the GPU being optimized for? For instance, an algorithm might be compute-bound on a high-core-count GPU like an H100 but memory-bound on lower-end consumer GPUs. Curious how the system handles this variability across hardware tiers.
