L_MAX=8; BATCH_SIZE=1000; N_FEATURES=2000
preparing real life transformation rule
transformation rule is computed
*************
CPU BENCHMARKS
Running on 72 threads
*************
***forward***
python loops; active dim 0; forward; cpu: 0.287917349073622
torch index_add_; active dim 0; forward; cpu: 0.2882208559248183
cpp; active dim 0; forward; cpu: 0.06418042712741429
python loops; active dim 1; forward; cpu: 0.10099238819546169
torch index_add_; active dim 1; forward; cpu: 0.19646917449103463
cpp; active dim 1; forward; cpu: 0.015409390131632486
python loops; active dim 2; forward; cpu: 1.13313627243042
torch index_add_; active dim 2; forward; cpu: 0.9349785645802816
cpp; active dim 2; forward; cpu: 0.029056257671780057
***backward***
python loops; active dim 0; backward; cpu: 8.56085040834215
torch index_add_; active dim 0; backward; cpu: 0.8768206967247857
cpp; active dim 0; backward; cpu: 0.14745905664232042
python loops; active dim 1; backward; cpu: 12.528574811087715
torch index_add_; active dim 1; backward; cpu: 1.3579767015245225
cpp; active dim 1; backward; cpu: 0.11550368203057183
python loops; active dim 2; backward; cpu: 1.43605547481113
torch index_add_; active dim 2; backward; cpu: 1.3703345987531874
cpp; active dim 2; backward; cpu: 0.05493460761176215