PyTorch GitHub commits
Author information
Name: Jagadish Krishnamoorthy
Emails:
- jagadish.krishnamoorthy@amd.com (recent)
- jagdish.krishna@gmail.com (older)
Commits till: 2/10/2026
Commit history
- f8454dc4e04 - [ROCm] Enable scaled group mm on gfx950 (#173737)
- ea123f27ee0 - CUDAScaledBlas - replace FBGEMM_GENAI with MSLK (#172988)
- 0f2064707a3 - [ROCm] Unifying hipBLASLt architecture lists into common hook methods (#172791)
- 65eeb563478 - fix build error
- 9ae6009bc34 - GroupBlas - Check fnuz type only for gfx942
- c118e1fa5b3 - [ROCm] Enable scaled group mm on gfx950
- a6dc0642eb4 - [ROCm] Use HIPCachingAllocator for CK argument and workspace buffers (#172311)
- a7ee6e41091 - [ROCm] Add unit test to verify grouped GEMM CK opt-in flag (#171901)
- 188a1ee7549 - [ROCm] Make grouped GEMM CK opt-in via env and default to fallback path (#171140)
- 282d2eb4047 - [ROCm] Refactor ROCm CK config generation into shared helper (#171121)
- a69907a41e0 - [ROCm] Make grouped GEMM CK opt-in via env and default to fallback path (#170159)
- 62985304339 - [ROCm] inductor/fp8 test: Check for "cuda" in device type. (#170254)
- 5058132088b - [ROCm] Enable group gemm on gfx90a (#169356)
- 4887c46900e - [ROCm] Fix HIP document url. (#168220)
- f9b81e23e46 - [ROCm] Disable group gemm CK path when composable kernel (CK) is not enabled (#167403)
- dc00842b81b - [ROCm][CI] trigger magma build with gfx950 for ROCm7.1 (#167390)
- 32d30d96cf2 - [ROCm][CI] unconditionally add gfx950, gfx115x to PYTORCH_ROCM_ARCH (#167299)
- af829c0dade - [ROCm] Skip nvfp4 tests on ROCm (#167066)
- c17aa0f1130 - [ROCm] Enable group gemm through CK (#166334)
- 1fa520ea654 - [ROCm] Enable group gemm through CK (#166334)
- 34ed7a8f0d1 - [ROCm] Skip test_blockwise_nvfp4_with_global_scale (#165968)
- 8951df03ded - test_scaled_matmul_cuda: fix infer_scale_swizzle (#165788)
- 7669ac94028 - [ROCm] Add scaled_mm v2 support. (#165528)
- c7e30ae4dd9 - MX: Remove redundant PLATFORM_SUPPORTS_MX_GEMM constant (#164320)
- 264e7f68a09 - [ROCm] Fix mx fp8 and fp4 code after scaling refactor changes. (#163127)
- 8bc4a467a7c - [ROCm] test_aot_inductor: Enable fp8 tests. (#163050)
- 01c3c891c19 - [ROCm] Enable test_fixed_striding (#162787)
- 6944d4b6397 - [ROCm] rocblas Aten GEMM overload for FP32 output from FP16/BF16 inputs (#162600)
- a8d6943d36c - ROCm: Enable overload tests from test_matmul_cuda (#161540)
- d2b8c0d431e - forward fix of #152198 (#161166)
- 543896fcf33 - test_matmul_cuda: Refine MX test skipping (#161009)
- 0d99b4e9e29 - ROCm: Enable tf32 testing on test_nn (#148945)
- 6fa1b171955 - ROCm: Add trailing comma for consistency in gfx architecture list (#150250)
- ed9c8a5d136 - ROCm: Disable torch check for Multiplication of two Float8_e5m2 matrices (#148228)
- 0ea5d1067bc - ROCm: Remove static specifier for allow_tf32 variable. (#147186)
- 17e05cde0c4 - ROCm: Skip tests in elastic/utils/distributed_test (#144692)
- 8f3eb843730 - ROCm: Enable 4 gpu tests for distributed config (#140319)
- 674d59359d9 - [ROCm] Enable dist sharded_tensor test suites (#137724)
- ecf08a0f8b1 - [ROCm] Enable test_filtering_env_var (#84100)
- f58ba553b78 - [ROCm] Fix distributed tests failure and enable ROCm distributed CI (#92932)
- 0a4e4de525a - [ROCm] add case for FP32MatMulPattern skip property (#84077)
- 9efca7c0850 - [ROCm] [FakeTensorTest] Enable test_fallback_memory_prop (#85760)
- f5bfa4d0888 - [ROCm] Enable test_multiprocessing tests (#82356)
- 7af3208412c - [ROCm] Enable test_ddp_profiling_torch_profiler (#82749)
- 594652f0e49 - [ROCm]: Enable test_grad_layout_1devicemodule_1replicaperprocess (#82005)
- 70e86b4562e - [test_shape_ops] Increase system memory requirement (#80369)
- 2d354cdc2ac - [ROCm] Enable test_instantiator, test_type_hints (#78633)
- 2bb4fce8b98 - [ROCm] TestGradients: Enable grad and gradgrad (#78401)
- 3ee863cb7c0 - [ROCm] enable test_lobpcg_ortho_cuda_float64 (#78385)
- 81586a6a5ec - ROCm: Enable test_distributed_spawn
- 60e2ee3937d - ROCm: unskip c10 gloo tests
- 6ca8272d46a - [Distributed tests] Add skip for odd world_size condition
- 317b8fa7aef - ROCm: Enable TestUnaryUfuncsCUDA tests
- 26ba7a92975 - ROCm: Enable test_masked_scatter_large_tensor
- da4a95c79a6 - [ROCm] Use hipCUB/rocPRIM scan algorithms for large index support (#68487)
- 70a5113e03f - [ROCm] update Magma for 4.3 release (#65203)
- 8bcf01631a1 - [ROCm] update magma (#62502)
- 64d61901eb3 - [ROCm] Skip test_masked_scatter_large_tensor_cuda (#61313)
- 95c26b28067 - [ROCm] disable test test_Conv2d_groups_nobias for ROCm (#59158)
- fd67088a578 - [Distributed test]Enable ddp_control_flow tests for ROCm (#57159)
- 316804e373d - [test_c10d] Add wait in nccl high priority stream test (#54714)
- ec6a7cace3c - [ROCm] Fix the flaky test test_stream_event_nogil (#53850)
- 0a549f9412e - [ROCm] Disable flaky tests on ROCm (#53192)
- 2cf90982e9b - [TestZeroRedundancyOptimizer] Add multi gpu checker (#53564)
- 506fdf9abfe - [ROCm] disable tests for ROCm 4.0.1 (#51510)
- eb0fe706802 - [distributed_test]Enable disabled ROCm tests. (#50421)
- 7e05d07ca75 - [distributed_test_c10d]Enable disabled ROCm tests. (#50629)
- c115957df08 - [distributed] Provide parameter to pass GPU ID in barrier function (#49069)
- 03abd81b8de - [ROCm] Enable skipped distributed global tests (#48023)
- 1606899dbe9 - distributed_test: Map rank to GPU accordingly (#47898)