We are the Fluminense group, a group of graduate students in the UMD Computer Science Department. Our project deals with the Google N-Gram data.
We are currently running experiments on how the Google N-Gram data can be processed to reveal useful trends. Ideally, this should help users apply the data in AI applications and search for specific word co-occurrences in a large corpus. Once the project is completed, the source code and executables will be available for download.
The program is being developed under the GNU General Public License.
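To give a rough idea of what searching for word co-occurrences in n-gram data can look like, here is a minimal sketch. It is not the project's code. It assumes each input line holds an n-gram and its count separated by a tab (the layout commonly used for n-gram count files), and the function name `cooccurrence_counts` and the sample lines are made up for illustration.

```python
from collections import Counter


def cooccurrence_counts(lines, target):
    """Sum the counts of every word that shares an n-gram with `target`.

    Assumption: each line looks like "w1 w2 ... wn<TAB>count".
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Split on the last tab: the n-gram is on the left, the count on the right.
        ngram, _, count = line.rpartition("\t")
        if not ngram:
            continue  # skip blank lines and lines without a tab
        tokens = ngram.split(" ")
        if target not in tokens:
            continue
        # Credit each distinct neighbouring word once per n-gram entry.
        for token in set(tokens):
            if token != target:
                counts[token] += int(count)
    return counts


# Hypothetical sample data, not taken from the real corpus.
sample = [
    "ice cream\t120",
    "cream cheese\t80",
    "ice cold water\t45",
    "cold cream\t10",
]

result = cooccurrence_counts(sample, "cream")
top_two = result.most_common(2)
```

With the sample above, `top_two` holds the two words that appear most often alongside "cream". On real data the same loop could read from a file object instead of a list, since it only iterates over lines.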
Niwa, Y. and Nitta, Y. Co-occurrence vectors from corpora versus distance vectors from dictionaries.
Edmonds, P. Choosing the Word Most Typical in Context Using a Lexical Co-Occurrence Network.
Ferret, O. Discovering word senses from a network of lexical co-occurrences.
Higinbotham, D. Semantic co-occurrences networks.
Veling, A. and van der Weerd, P. Conceptual grouping in word co-occurrence networks.
Project Guide: Ted Pedersen
Ankur, Sarika, Prafulla, Aneerudh