PyTorch GitHub Commits

Author information

Name: Jagadish Krishnamoorthy
Emails:
- jagadish.krishnamoorthy@amd.com (recent)
- jagdish.krishna@gmail.com (older)
Commits through: 2/10/2026

Commit history

f8454dc4e04 - [ROCm] Enable scaled group mm on gfx950 (#173737)
ea123f27ee0 - CUDAScaledBlas - replace FBGEMM_GENAI with MSLK (#172988)
0f2064707a3 - [ROCm] Unifying hipBLASLt architecture lists into common hook methods (#172791)
65eeb563478 - fix build error
9ae6009bc34 - GroupBlas - Check fnuz type only for gfx942
c118e1fa5b3 - [ROCm] Enable scaled group mm on gfx950
a6dc0642eb4 - [ROCm] Use HIPCachingAllocator for CK argument and workspace buffers (#172311)
a7ee6e41091 - [ROCm] Add unit test to verify grouped GEMM CK opt-in flag (#171901)
188a1ee7549 - [ROCm] Make grouped GEMM CK opt-in via env and default to fallback path (#171140)
282d2eb4047 - [ROCm] Refactor ROCm CK config generation into shared helper (#171121)
a69907a41e0 - [ROCm] Make grouped GEMM CK opt-in via env and default to fallback path (#170159)
62985304339 - [ROCm] inductor/fp8 test: Check for "cuda" in device type. (#170254)
5058132088b - [ROCm] Enable group gemm on gfx90a (#169356)
4887c46900e - [ROCm] Fix HIP document url. (#168220)
f9b81e23e46 - [ROCm] Disable group gemm CK path when composable kernel (CK) is not enabled (#167403)
dc00842b81b - [ROCm][CI] trigger magma build with gfx950 for ROCm7.1 (#167390)
32d30d96cf2 - [ROCm][CI] unconditionally add gfx950, gfx115x to PYTORCH_ROCM_ARCH (#167299)
af829c0dade - [ROCm] Skip nvfp4 tests on ROCm (#167066)
c17aa0f1130 - [ROCm] Enable group gemm through CK (#166334)
1fa520ea654 - [ROCm] Enable group gemm through CK (#166334)
34ed7a8f0d1 - [ROCm] Skip test_blockwise_nvfp4_with_global_scale (#165968)
8951df03ded - test_scaled_matmul_cuda: fix infer_scale_swizzle (#165788)
7669ac94028 - [ROCm] Add scaled_mm v2 support. (#165528)
c7e30ae4dd9 - MX: Remove redundant PLATFORM_SUPPORTS_MX_GEMM constant (#164320)
264e7f68a09 - [ROCm] Fix mx fp8 and fp4 code after scaling refactor changes. (#163127)
8bc4a467a7c - [ROCm] test_aot_inductor: Enable fp8 tests. (#163050)
01c3c891c19 - [ROCm] Enable test_fixed_striding (#162787)
6944d4b6397 - [ROCm] rocblas Aten GEMM overload for FP32 output from FP16/BF16 inputs (#162600)
a8d6943d36c - ROCm: Enable overload tests from test_matmul_cuda (#161540)
d2b8c0d431e - forward fix of #152198 (#161166)
543896fcf33 - test_matmul_cuda: Refine MX test skipping (#161009)
0d99b4e9e29 - ROCm: Enable tf32 testing on test_nn (#148945)
6fa1b171955 - ROCm: Add trailing comma for consistency in gfx architecture list (#150250)
ed9c8a5d136 - ROCm: Disable torch check for Multiplication of two Float8_e5m2 matrices (#148228)
0ea5d1067bc - ROCm: Remove static specifier for allow_tf32 variable. (#147186)
17e05cde0c4 - ROCm: Skip tests in elastic/utils/distributed_test (#144692)
8f3eb843730 - ROCm: Enable 4 gpu tests for distributed config (#140319)
674d59359d9 - [ROCm] Enable dist sharded_tensor test suites (#137724)
ecf08a0f8b1 - [ROCm] Enable test_filtering_env_var (#84100)
f58ba553b78 - [ROCm] Fix distributed tests failure and enable ROCm distributed CI (#92932)
0a4e4de525a - [ROCm] add case for FP32MatMulPattern skip property (#84077)
9efca7c0850 - [ROCm] [FakeTensorTest] Enable test_fallback_memory_prop (#85760)
f5bfa4d0888 - [ROCm] Enable test_multiprocessing tests (#82356)
7af3208412c - [ROCm] Enable test_ddp_profiling_torch_profiler (#82749)
594652f0e49 - [ROCm]: Enable test_grad_layout_1devicemodule_1replicaperprocess (#82005)
70e86b4562e - [test_shape_ops] Increase system memory requirement (#80369)
2d354cdc2ac - [ROCm] Enable test_instantiator, test_type_hints (#78633)
2bb4fce8b98 - [ROCm] TestGradients: Enable grad and gradgrad (#78401)
3ee863cb7c0 - [ROCm] enable test_lobpcg_ortho_cuda_float64 (#78385)
81586a6a5ec - ROCm: Enable test_distributed_spawn
60e2ee3937d - ROCm: unskip c10 gloo tests
6ca8272d46a - [Distributed tests] Add skip for odd world_size condition
317b8fa7aef - ROCm: Enable TestUnaryUfuncsCUDA tests
26ba7a92975 - ROCm: Enable test_masked_scatter_large_tensor
da4a95c79a6 - [ROCm] Use hipCUB/rocPRIM scan algorithms for large index support (#68487)
70a5113e03f - [ROCm] update Magma for 4.3 release (#65203)
8bcf01631a1 - [ROCm] update magma (#62502)
64d61901eb3 - [ROCm] Skip test_masked_scatter_large_tensor_cuda (#61313)
95c26b28067 - [ROCm] disable test test_Conv2d_groups_nobias for ROCm (#59158)
fd67088a578 - [Distributed test]Enable ddp_control_flow tests for ROCm (#57159)
316804e373d - [test_c10d] Add wait in nccl high priority stream test (#54714)
ec6a7cace3c - [ROCm] Fix the flaky test test_stream_event_nogil (#53850)
0a549f9412e - [ROCm] Disable flaky tests on ROCm (#53192)
2cf90982e9b - [TestZeroRedundancyOptimizer] Add multi gpu checker (#53564)
506fdf9abfe - [ROCm] disable tests for ROCm 4.0.1 (#51510)
eb0fe706802 - [distributed_test]Enable disabled ROCm tests. (#50421)
7e05d07ca75 - [distributed_test_c10d]Enable disabled ROCm tests. (#50629)
c115957df08 - [distributed] Provide parameter to pass GPU ID in barrier function (#49069)
03abd81b8de - [ROCm] Enable skipped distributed global tests (#48023)
1606899dbe9 - distributed_test: Map rank to GPU accordingly (#47898)
