02 | August | 2010 | The Open Library Blog

A while back, Ben Gimpert wrote a guest post for us called “Open Library Ore”, explaining how he had begun to hack on the massive full-text corpus at the Internet Archive. He has been practising various Natural Language Processing techniques to teach machines to glean the topics of books by sheer letter crunching. It turns out the elements in the ore are beginning to emerge, particularly in the form of a dataset available for download under an Attribution-Noncommercial-Share Alike 3.0 CC license… Please, if you know SQL, why not download the dataset and see what you can find out? We’d love to hear about any discoveries you make, perhaps in the comments.

Daily Archives: August 2, 2010

Open Library Ore: A MySQL data dump is available