HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Applying algorithms on large data

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
applyinglargedataalgorithms

Problem

Is there any book or tutorial that teaches us how to efficiently apply the common algorithms (sorting, searching, etc.) on large data (i.e. data that cannot be fully loaded into main memory) and how to efficiently apply those algorithms considering the cost of block transfer from external memory ? For example, almost all algorithm textbooks say that B and B+-trees can be used to store data on disk. However, actually how this can be done, especially handling the pointers where the data is present on disk is not explained. Similarly, though many books teach searching techniques, they do not consider data present in secondary memory.

I have checked Knuth's book. Although it discusses these ideas, I still did not understand how to actually apply them in a high-level language. Is there any reference that discusses these details?

Solution

Database books are good example. However, have a look at the field I/O efficient data structures (and algorithms). To my knowledge, there are some courses about this topic, but very few books.

Check this book: U. Meyer, P. Sanders, and J. Sibeyn (eds.), Algorithms for Memory Hierarchies, Lecture Notes in Computer Science 2625, Springer, 2003.

Check these courses:
http://www.win.tue.nl/~hermanh/teaching/2IL35/
http://www.daimi.au.dk/~large/ioS12/

and these slides:
algo2.iti.kit.edu/sanders/courses/algen09-10/rdslides.pdf

Context

StackExchange Computer Science Q#3287, answer score: 2

Revisions (0)

No revisions yet.