These are the profiling data from my Athlon 1.4GHz. Notice the *huge*
speedups when multiplying large (non-integer) matrices.

Explanation:
------------
dotblas.dot == BLAS dot with Python wrapper
Numeric.dot == matrix product with its Numeric wrapper

Interpretation of the timings: for each entry
    a*b is called 1000 times
    A*a is called 100 times
    A*B is called 10 times

TYPECODE: D
===========
Function   | a*b (10x1 * 10x1)  | A*a (10x10 * 10x1) | A*B (10x10 * 10x10)
-----------+--------------------+--------------------+--------------------
dotblas.dot|      0.00631       |      0.00084       |      0.00017
-----------+--------------------+--------------------+--------------------
Numeric.dot|      0.00604       |      0.00073       |      0.00019
-----------+--------------------+--------------------+--------------------

Function   |  a*b (100x1 * 100x1)   | A*a (100x100 * 100x1)  | A*B (100x100 * 100x100)
-----------+------------------------+------------------------+------------------------
dotblas.dot|        0.00712         |        0.00635         |        0.06226
-----------+------------------------+------------------------+------------------------
Numeric.dot|        0.00703         |        0.00912         |        0.11361
-----------+------------------------+------------------------+------------------------

Function   |   a*b (1000x1 * 1000x1)    |  A*a (1000x1000 * 1000x1)  | A*B (1000x1000 * 1000x1000)
-----------+----------------------------+----------------------------+----------------------------
dotblas.dot|          0.01357           |          2.99075           |          40.56162
-----------+----------------------------+----------------------------+----------------------------
Numeric.dot|          0.01310           |          4.70348           |         895.69399
-----------+----------------------------+----------------------------+----------------------------

TYPECODE: l
===========
Function   | a*b (10x1 * 10x1)  | A*a (10x10 * 10x1) | A*B (10x10 * 10x10)
-----------+--------------------+--------------------+--------------------
dotblas.dot|      0.02385       |      0.00244       |      0.00032
-----------+--------------------+--------------------+--------------------
Numeric.dot|      0.00551       |      0.00069       |      0.00013
-----------+--------------------+--------------------+--------------------

Function   |  a*b (100x1 * 100x1)   | A*a (100x100 * 100x1)  | A*B (100x100 * 100x100)
-----------+------------------------+------------------------+------------------------
dotblas.dot|        0.02398         |        0.00678         |        0.04120
-----------+------------------------+------------------------+------------------------
Numeric.dot|        0.00605         |        0.00475         |        0.04067
-----------+------------------------+------------------------+------------------------

Function   |   a*b (1000x1 * 1000x1)    |  A*a (1000x1000 * 1000x1)  | A*B (1000x1000 * 1000x1000)
-----------+----------------------------+----------------------------+----------------------------
dotblas.dot|          0.02854           |          1.38042           |         453.25659
-----------+----------------------------+----------------------------+----------------------------
Numeric.dot|          0.00958           |          1.46957           |         452.55097
-----------+----------------------------+----------------------------+----------------------------

TYPECODE: d
===========
Function   | a*b (10x1 * 10x1)  | A*a (10x10 * 10x1) | A*B (10x10 * 10x10)
-----------+--------------------+--------------------+--------------------
dotblas.dot|      0.00570       |      0.00080       |      0.00012
-----------+--------------------+--------------------+--------------------
Numeric.dot|      0.00546       |      0.00070       |      0.00015
-----------+--------------------+--------------------+--------------------

Function   |  a*b (100x1 * 100x1)   | A*a (100x100 * 100x1)  | A*B (100x100 * 100x100)
-----------+------------------------+------------------------+------------------------
dotblas.dot|        0.00567         |        0.00327         |        0.03053
-----------+------------------------+------------------------+------------------------
Numeric.dot|        0.00578         |        0.00467         |        0.04665
-----------+------------------------+------------------------+------------------------

Function   |   a*b (1000x1 * 1000x1)    |  A*a (1000x1000 * 1000x1)  | A*B (1000x1000 * 1000x1000)
-----------+----------------------------+----------------------------+----------------------------
dotblas.dot|          0.00719           |          1.46835           |          10.12871
-----------+----------------------------+----------------------------+----------------------------
Numeric.dot|          0.00840           |          2.17318           |         595.59559
-----------+----------------------------+----------------------------+----------------------------

TYPECODE: f
===========
Function   | a*b (10x1 * 10x1)  | A*a (10x10 * 10x1) | A*B (10x10 * 10x10)
-----------+--------------------+--------------------+--------------------
dotblas.dot|      0.00728       |      0.00068       |      0.00012
-----------+--------------------+--------------------+--------------------
Numeric.dot|      0.00679       |      0.00068       |      0.00015
-----------+--------------------+--------------------+--------------------

Function   |  a*b (100x1 * 100x1)   | A*a (100x100 * 100x1)  | A*B (100x100 * 100x100)
-----------+------------------------+------------------------+------------------------
dotblas.dot|        0.00720         |        0.00256         |        0.00534
-----------+------------------------+------------------------+------------------------
Numeric.dot|        0.00705         |        0.00411         |        0.03423
-----------+------------------------+------------------------+------------------------

Function   |   a*b (1000x1 * 1000x1)    |  A*a (1000x1000 * 1000x1)  | A*B (1000x1000 * 1000x1000)
-----------+----------------------------+----------------------------+----------------------------
dotblas.dot|          0.01009           |          0.74370           |           6.53564
-----------+----------------------------+----------------------------+----------------------------
Numeric.dot|          0.00969           |          1.18666           |         415.52507
-----------+----------------------------+----------------------------+----------------------------

python profileDot.py  2638.80s user 73.46s system 93% cpu 48:19.31 total
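
To put a number on the speedups in the 1000x1000 A*B column, the ratio
Numeric.dot / dotblas.dot can be computed per typecode. The following is
only a small sketch: the figures are copied from the tables above, and the
names `timings`, `speedup` and `speedups` are made up for illustration
(they are not part of profileDot.py).

```python
# Times for the 1000x1000 A*B case (10 calls each), copied from the tables.
# Keys are the typecodes; values are (dotblas.dot, Numeric.dot) timings.
timings = {
    "D": (40.56162, 895.69399),
    "l": (453.25659, 452.55097),
    "d": (10.12871, 595.59559),
    "f": (6.53564, 415.52507),
}


def speedup(blas_time, numeric_time):
    """Return how many times faster dotblas.dot is than Numeric.dot."""
    return numeric_time / blas_time


# Hypothetical helper dict: typecode -> speedup factor.
speedups = {tc: speedup(*pair) for tc, pair in timings.items()}

for tc, ratio in speedups.items():
    print(f"typecode {tc}: {ratio:.1f}x")
```

This gives roughly 22x for D, 59x for d and 64x for f, while the integer
typecode l stays at about 1x, which matches the "(non-integer)" remark at
the top.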