Sunday, February 13, 2011

C vs Python vs Perl sort comparison

First findings are:
  1. C is way faster at sorting than Python.
  2. Python code is way shorter than C.
  3. Python is faster than Perl and less memory hungry.

Python source code: Python sort code
Perl source code: Perl sort code

C code was taken from here.
Heap sort in C: file_sort_heap.c.
Quick sort in C: file_sort_quick.c


Doc file is here.

Second findings are:
Python code can be a memory eater.
Explanation is here and here. Each string in the list takes 80 bytes, which adds up to 2288 MB of RAM for 30,000,000 strings. A solution would be to use a smarter data structure in Python (a standard int is 24 bytes).
In C we have a 2-byte short int, which is a little bit less :).
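
The per-object overhead described above can be measured directly. The following is only a rough sketch, not the code used for the numbers in this post: it compares a list of strings, a list of ints and a packed array from the standard array module. The helper name list_payload_bytes and the sample data are made up for illustration, and the exact byte counts depend on the Python version and platform, so they will not necessarily match the 80 and 24 bytes quoted above.

```python
import sys
from array import array


def list_payload_bytes(values):
    """Approximate bytes used by a list plus the objects it references.

    Hypothetical helper for illustration: it adds the size of the list
    itself (the pointer table) to the size of every element object.
    """
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


# Sample data (an assumption, not the data set from this post).
numbers = list(range(1000, 101000))      # 100,000 distinct ints
as_strings = [str(n) for n in numbers]   # what you hold after reading lines
as_array = array("i", numbers)           # packed C ints, no per-item object

print("list of str:", list_payload_bytes(as_strings), "bytes")
print("list of int:", list_payload_bytes(numbers), "bytes")
print("array('i'): ", sys.getsizeof(as_array), "bytes")
```

If every value fits in 16 bits, array("h", ...) would be the closest match to the C short int, at 2 bytes per item.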

Sorting a file with 30,000,000 numbers in Python took around 2-3 GB of RAM (the file itself was 150 MB), which is too much memory. A solution is to use a sorting method designed for big files, for example something like here, or to use C.
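
One common way to sort a file that doesn't fit in RAM is an external merge sort: sort fixed-size chunks in memory, write each chunk to a temporary file, then merge the sorted chunks. The sketch below is a minimal illustration of that idea, not the method behind the link above. It assumes the input has one integer per line, and the function names and the chunk_size default are made up for the example.

```python
import heapq
import os
import tempfile


def _write_sorted_chunk(numbers, tmp_dir):
    """Sort one in-memory chunk and write it to a temporary file."""
    numbers.sort()
    fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".chunk", text=True)
    with os.fdopen(fd, "w") as chunk_file:
        chunk_file.writelines("%d\n" % n for n in numbers)
    return path


def _read_ints(path):
    """Yield the integers of a chunk file one at a time."""
    with open(path) as f:
        for line in f:
            yield int(line)


def external_sort(in_path, out_path, chunk_size=1000000):
    """Sort a file of integers (one per line, an assumed format).

    Only about chunk_size numbers are held in memory at any moment.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        chunk_paths = []
        chunk = []
        with open(in_path) as src:
            for line in src:
                if not line.strip():
                    continue  # skip blank lines
                chunk.append(int(line))
                if len(chunk) >= chunk_size:
                    chunk_paths.append(_write_sorted_chunk(chunk, tmp_dir))
                    chunk = []
        if chunk:
            chunk_paths.append(_write_sorted_chunk(chunk, tmp_dir))

        # heapq.merge reads the sorted chunks lazily, so the merge step
        # keeps only one pending number per chunk in memory.
        with open(out_path, "w") as dst:
            merged = heapq.merge(*(_read_ints(p) for p in chunk_paths))
            dst.writelines("%d\n" % n for n in merged)
```

With chunk_size=1000000 a 30,000,000-number file would produce 30 temporary chunk files. A smaller chunk size lowers peak memory but opens more files during the merge.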

With Perl I couldn't sort the 30,000,000-number file because it ran out of memory (4 GB RAM and 2 GB swap).

It would be interesting to try Python with the psyco module. Unfortunately the RPM isn't available in Fedora 14.