Monthly Archives: September 2007

Data transformation tools

I have been using data integration and transformation software for almost a decade now, including packages that manage data flow for very different use cases. Simulink, a product from the MathWorks, manages data flow for simulations of control systems and digital signal processing systems. Products like Ascential DataStage (now called [...]

New Manning-Schutze book (co-authored with P. Raghavan)

I just came across a new book by Manning and Schutze, the authors of “Foundations of Statistical NLP” (FSNLP), which I mentioned in my previous post. They co-authored this one with Prabhakar Raghavan (Verity, Yahoo, Stanford, etc.). The book is titled “Introduction to Information Retrieval.” It is still in pre-print and you can obtain an electronic version of [...]

Hacking natural language processing (NLP) components

I have been diving deeper into NLP-related code than ever before, using GATE as a platform for experiments. I chose GATE over some of the other platforms out there (e.g. UIMA, RapidMiner) because it is fairly actively developed, has a good set of built-in modules, and has a [...]