Instructions…
For this assignment you will write your own version of a merge sort and a bubble sort — sorting files of numbers.
Your programs should read a file of integers into an array of integers, sort them from least to greatest, then print the sorted list out.
(note: you do not have to sort “in-place” — you can sort into a newly allocated array)
For your program to allocate a large enough array up front, you may want to have the first number in your data files be the count of how many random numbers are in the file.
…Or it could be a command line parameter to your program.
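As a minimal sketch of the "count on the first line" option, assuming one integer per line (the function name `read_numbers` is made up for illustration, and Python is used here only as an example language):

```python
def read_numbers(lines):
    """Read integers from an iterable of text lines.

    Assumes the first line holds the count of numbers that follow,
    so the list can be allocated at full size up front.
    """
    source = iter(lines)
    count = int(next(source))      # first line: how many numbers follow
    numbers = [0] * count          # allocate the whole array once
    for index in range(count):
        numbers[index] = int(next(source))
    return numbers
```

An open file object can be passed directly as `lines`, since iterating over a file yields its lines.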
Your version of these programs should include keeping count every time your program compares two items from the list.
Print the count of comparisons when the program ends.
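One possible way to keep the count is to have each sort return it alongside the sorted list. The sketch below is only an illustration of that idea, not a required design: the function names are hypothetical, both sorts copy into a new list rather than sorting in-place, and only comparisons between two list items are counted (loop-bound checks are not).

```python
def merge_sort(items):
    """Return (sorted_copy, comparisons) using a top-down merge sort."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, left_count = merge_sort(items[:mid])
    right, right_count = merge_sort(items[mid:])
    merged = []
    count = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        count += 1                      # one comparison of two list items
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])             # leftovers need no comparisons
    merged.extend(right[j:])
    return merged, count


def bubble_sort(items):
    """Return (sorted_copy, comparisons) using a plain bubble sort (no early exit)."""
    data = list(items)
    count = 0
    for end in range(len(data) - 1, 0, -1):
        for k in range(end):
            count += 1                  # one comparison of two adjacent items
            if data[k] > data[k + 1]:
                data[k], data[k + 1] = data[k + 1], data[k]
    return data, count
```

For example, `bubble_sort([4, 3, 2, 1])` returns `([1, 2, 3, 4], 6)` and `merge_sort([4, 3, 2, 1])` returns `([1, 2, 3, 4], 4)`.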
It is up to you how you keep the “count of comparisons” separated from the sorted output list.
On Linux, you might use stdout and stderr…
…or you might give filenames as command line args to your programs, so that the only stdout from your program is the stats on how many comparison steps were used to sort the list.
…or you might have a command line arg with a filename specifically to **append** stats to.
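Two of the options above can be sketched roughly as follows. The function names, the `comparisons:` label, and the "size count" line format are all assumptions made for this example; use whatever layout is easiest for your scripts to collect.

```python
import sys


def report(sorted_items, comparisons, out=None, stats=None):
    """Write the sorted list to stdout and the comparison count to stderr."""
    out = sys.stdout if out is None else out
    stats = sys.stderr if stats is None else stats
    for value in sorted_items:
        print(value, file=out)
    print(f"comparisons: {comparisons}", file=stats)


def append_stats(path, size, comparisons):
    """Append one 'size comparisons' line to a stats file (mode 'a' never truncates)."""
    with open(path, "a") as handle:
        handle.write(f"{size} {comparisons}\n")
```

With the stdout/stderr version, a shell redirect such as `> sorted.txt 2>> stats.txt` (file names hypothetical) keeps the two streams in separate files.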
Write automated script(s) to run the tests of your programs against files of various sizes.
Keep in mind the point of these scripts is to make it easier to re-run tests as needed and collect data in an organized, easy-to-use form.
Make your scripts as useful as you need them to be to save yourself time.
Example: perhaps give the script command line args to control the range of file sizes to test…
…so you can re-run only the tests that need to be re-run.
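A test script along these lines might look like the sketch below. Everything named here is hypothetical: the `--min-exp`, `--max-exp`, and `--sorts` options, the `./mergesort` and `./bubblesort` program names, and the `data_<size>.txt` file naming are placeholders for whatever your own setup uses.

```python
import argparse


def parse_args(argv):
    """Parse the range of file sizes (as powers of two) and which sorts to run."""
    parser = argparse.ArgumentParser(description="Re-run sort tests for a range of file sizes")
    parser.add_argument("--min-exp", type=int, default=1, help="smallest size is 2**min_exp")
    parser.add_argument("--max-exp", type=int, default=26, help="largest size is 2**max_exp")
    parser.add_argument("--sorts", nargs="+", default=["merge", "bubble"])
    return parser.parse_args(argv)


def plan_runs(args):
    """Build one command (program plus data file) per sort and file size."""
    runs = []
    for exp in range(args.min_exp, args.max_exp + 1):
        size = 2 ** exp
        for sort_name in args.sorts:
            # Hypothetical program and data file names.
            runs.append([f"./{sort_name}sort", f"data_{size}.txt"])
    return runs
```

Each planned command could then be handed to `subprocess.run`, so re-running only the merge sort on the larger files comes down to something like `--min-exp 20 --sorts merge`.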
Input file sizes should be every power of two from 2^1 to at least 2^26 numbers.
NOTE: your bubble sort will probably be too slow for the largest file sizes.
Once your tests start taking close to a half hour or an hour, maybe it is time to stop, unless you can let it run overnight…
File sizes (number of lines, number of random numbers in the list) should be:
2, 4, 8, 16, 32, 64….. 1024, 2048, 4K, 8K, 16K….. 1048576, 2M, 4M, 8M… 32M, 64M
that is the number of lines — the number of random numbers in each list…
…file sizes might each be one line longer if you put the size on the first line
Try to go larger if your environment can handle it and the sorts are not taking too long
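The data files themselves can be generated by a short script. The following is a sketch under several assumptions: one number per line, the count written on the first line, values drawn from 0 to 2^31 - 1, and a fixed seed so a file can be regenerated identically. The function name and file naming are illustrative only.

```python
import random


def write_data_file(path, size, seed=0):
    """Write `size` random integers, one per line, preceded by the count."""
    rng = random.Random(seed)              # seeded so reruns produce the same file
    with open(path, "w") as handle:
        handle.write(f"{size}\n")          # first line: how many numbers follow
        for _ in range(size):
            handle.write(f"{rng.randint(0, 2**31 - 1)}\n")
```

Calling it in a loop over `2 ** exp` for each exponent from 1 to 26 produces the full set of files; the largest ones take a while to write and use noticeable disk space.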
Create Plots: (see tip on controlling axes in Octave)
# of Comparison Steps:
Merge vs. Bubble — Small Files
choose a range for x & y axes that lets you easily compare the two while they are still somewhat close
x from 0..32 or 64 probably about right
choose a range for y that fits
Merge vs. Bubble — large files
now do the full range of file sizes (can still start the x axis at 0)
at this point you should see that one of them almost disappears compared to the other
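As a rough sanity check on what the plots should look like: a plain bubble sort with no early exit makes n(n-1)/2 comparisons on n items, while a top-down merge sort makes at most n * ceil(log2(n)). The sketch below simply tabulates those two formulas; your measured counts may differ, for instance if your bubble sort stops early on a pass with no swaps.

```python
import math


def bubble_comparisons(n):
    """Comparisons made by a bubble sort with no early exit."""
    return n * (n - 1) // 2


def merge_upper_bound(n):
    """A simple upper bound on comparisons made by top-down merge sort."""
    return n * math.ceil(math.log2(n)) if n > 1 else 0


for exp in (1, 5, 10, 20, 26):
    n = 2 ** exp
    print(n, bubble_comparisons(n), merge_upper_bound(n))
```

At n = 32 the two values are 496 and 160, close enough to compare on one plot; by n = 2^20 the bubble sort figure is tens of thousands of times larger.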
Time:
Merge v Bubble, small range
Same basic idea as with the # of comparison steps
For the smallest files, the true time is likely too small to easily measure
Choose a large enough range that at least a few results of both methods are > 1 ms
Merge v Bubble, large range
Same basic idea as with the # of comparison steps
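For the timing measurements, one approach is to wrap just the sort call in a high-resolution timer so that file reading and printing are not included. This is a sketch only: the helper name `time_sort` and the `repeats` parameter are invented for the example, and averaging over several repeats is one possible way to get a usable number when a single run is too fast to measure.

```python
import time


def time_sort(sort_fn, items, repeats=1):
    """Return (result, average_milliseconds) for sort_fn(items) over `repeats` runs."""
    result = None
    start = time.perf_counter()
    for _ in range(repeats):
        result = sort_fn(list(items))    # copy, so every run sorts unsorted input
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms / repeats
```

The copy made by `list(items)` is included in the measured time; for the small files where repeats matter, that overhead is worth keeping in mind when reading the results.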