AI in Information Retrieval and Language Processing collected by Wlodzislaw Duch
= superpages. Older version of this page
For a general list of AI resources, see the AI-ML page.
Introductory texts, talks and courses:
Journals, publishers, books
Labs and a few IR/NLP experts
- IR+IE people list | IR+IE working groups list |
- Microsoft NLP | MindNet Project | Conversation: Quartet and Deep Listener |
- Center for Intelligent Information Retrieval, CIIR
- Cognitive Machines, situated cognition, language grounding, Deb Roy at MIT Lab
- Human Language Technology Research Inst, Dallas Uni, S. Harabagiu
- N. Ide (Vassar), semantic web, NLP.
- S. Johnson (Columbia), NLP in biomedical domains.
- Mark Light, IR papers, question answering, biosciences
- Linguistic Modelling Department (LMD), Bulgarian Academy of Sciences.
- Friedman Nir, Bayesian belief networks, learning
- Ontotext, Sirma Group lab for knowledge processing and representation (KR) and linguistics (CL/NLP), Bulgaria | PROTON and KIMO ontologies, upper-level concepts necessary for semantic annotation, indexing, and retrieval.
- James W. Pennebaker, UoTexas, Austin, medical NLP, psychology of word use, Linguistic Inquiry and Word Count (LIWC) software.
- Steffen Staab, Koblenz, Semantic Web
- Y. Wilks (Sheffield), dialogue systems, pragmatics, lexicons.
- UAMIS AI Group , Hsinchun Chen, many interesting projects
- Scott Weiss, Johns Hopkins University, with IR glossary
- Sheffield NLP - many projects
- Wlodarczyk Andre, Sorbonne, semantic investigations project, some linguistic tools.
- Latent Semantic Analysis (Boulder)
- Latent Semantic Indexing (Tennessee)
- WebMining PL - SPSS
Most ambitious IR projects
Wikipedia related projects
Other very ambitious projects:
- Use Wordnet | Wordnet, "an online lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept".
- The Global WordNet Association, a platform for discussing, sharing and connecting wordnets for all languages in the world.
- Euro WordNet, building a multilingual database with wordnets for several European languages.
- Senseval, evaluate Word-Sense-Disambiguation systems with wordnet or wordnet-like resources.
- EUROTERM, extends the EuroWordNet database with specialized terminology
- Polish WordNet, słowosieć.
- BALKANET, EuroWordNet for more European languages: Czech, Romanian, Greek, Turkish, Bulgarian, and Serbian.
- MEANING, EuroWordNet with sense-tagged corpora extracted from the WWW and word-sense-disambiguation modules
- Wordmind - list all uses of a word in Wordnet
- Perl access to WordNet
- Lucene WordNet tools
- Augmented and Sense-clustered Wordnets, Stanford WordNet project, Rion Snow.
- WN Query Data
Semantic Web projects:
Other projects of interest:
- Hamburg Metaphor Database, a project where metaphoric relations in wordnet are explored and encoded: Hamburg Metaphor Database or Hamburger Metapherndatenbank.
- Ai (Israel), Hal, child machine MegaHAL project, started by Jason Hutchens - grow baby brain by talking to it! Current state - 18 month old, makes 2-3 word sentences in proper context, but will it ever grow? This project seems to be dead already.
- CANIS, Interspace, Medspace projects - solving the problem of deep semantics
- ConceptNet, commonsense knowledgebase and natural-language-processing toolkit, from the Open Mind initiative.
- Software Agents MIT Group using ConceptNet | Common Sense questions to collect info.
- CYC project, Cycorp, Inc has NLP components + common sense ontology
- EcoCyc/MetaCyc project, P. Karp, Stanford, is used to build biological cell models from text information
- FASTUS, Finite State Automata-based Text Understanding System (Appelt)
- FrameNet, on-line lexical resource for English, based on frame semantics.
- HAL - High Dimensional Semantic Space | Curt Burgess homepage
- High Dimensional Semantic Spaces
- High Performance Knowledge Bases (HPKB), and other projects of DARPA
- Honda Open Mind Indoor Common Sense
- HowNet, English and Chinese
- iCub baby robot, intended to learn language at the level of a 2-year-old.
- LifeNet, First-Person Commonsense.
- Multicentric Technology, computer-aided thinking
- NELL: Never-Ending Language Learning
- Shruti, reflexive reasoning network
- Ontolingua and Chimaera Projects, Stanford.
- Robot Lawyer - free advice, do not pay!
- Standard Upper Ontology Working Group (SUO WG), formal first order logic ontology linked to WordNet |
- Semantic Web |
- The Suggested Upper Merged Ontology portal | Sigma project on SourceForge, tools for SUMO | KSMSA ontology browser (SUMO, Wordnet) |
- The Multi-Source Ontology (MSO)
- National Center for Biomedical Ontology, list of many ontologies, visualization |
- Open Biomedical Ontologies for bioinformatics |
Agents: Learning Intelligent Book Recommending Agent | Prody Parrot, personal assistant from MindMaker | Talking Buddy, an agent for everything |
SciAgent for Multidisciplinary Problem Solving Environments | Survey of Cognitive and Agent Architectures |
Virtual humans & avatars
Virtual Friends demos (Haptek) | Virtual Bush (Haptek) |
Pulse 3D Veepers, Virtual Personality
Sitepal flash characters | Speak2me flash |
Oddcast animated characters | AI oddcast | Sports trivia game | Vhost flash characters | Ms Dewey animated character |
CogWorks Virtual People news | Digital Space Worlds | Active worlds | John Lennon | Virtual worlds company | Talking heads in java | Center for Human Modeling and Simulation | Steven Stahlberg gallery | Virtual Humans, Peter Plantec |
Artificial-life, mobile phone bots | Artificial-life Bot-me | their e-learning portal |
Agentland | Chatterbot challenge | Chatbots.org (all world)| InteractiveStory good links | Personality forge, chatterbot collection | Simon Laven bots, many! |
Arthur, with source code | Conversive Application Platform, Agents and dialog system | DELCA Ghosts, specialized | e-brain as a salesperson | Icogno, nice examples, based on Jabberwacky | Simon in e-brain |
Talk to Alice and AIML | Botizen | Jabberwacky | Jabberwock | Chat-bot | Nicole | Ramona | SmarterChild | Start (MIT) | Ultra Hal Assistant | Artificial Solutions (Kiwi Logic Lingubots) | Intelliwise (nice graphics).
Polish bots: Fido Intelligence (formerly Kiwilogic) | Cathy (Michal Zalewski) | Inguaris | Paczucha |
Beyond bots: Virtual Personal Assistants, Wiki: Virtual Personal Assistant (VPA) | Amities, Automated Multilingual Interaction with Information and Services |
Marek Kasperski: Aibotworld | Kognitywistyka.net |
Embodied Conversational Agents
Agents, Avatars, Bots, Virtual Humans
: Embodied agent
Hung-Hsuan Huang projects: GECA (Generic ECA), platform for modular ECA components |
eneration for interactions with embodied conversational agents |
Air Force BayesNet | Int. Soc. for Bayesian Analysis | ASA Section on Bayesian Statistical Sciences | Bayesian networks meta-resource | Bayesian belief networks
Computational biology and bioinformatics applications:
Martin Tompa, Washington (sequences) |
HMM for Proteins | HMM tutorials | Genes and disease (molecular biology book chapter)
: Representing word meaning and order information in a composite holographic lexicon
Emotions from text:
Emotus Ponens, MIT Lab textual affect sensing, based on Concept Net.
SentiStrength, a sentiment analysis (opinion mining) program
IR Web interfaces:
Web Intelligence Consortium (WIC) | European Network for Intelligent Information Interfaces | European Network of Excellence in Text Mining, NEMIS |
UAMIS AI Group , Hsinchun Chen - OOHAY web visualization
IR systems demonstrations:
ISYS search software |
Acrophile, acronym database.
Eliyon, business employees, automatic content generation.
Flipdog, job search.
Knowledge Modeling, Ontology:
Ontology intro (Wikipedia) | Ontology | TOP Ontology Page | DAML Ontology Library | KQML or the Knowledge Query and Manipulation Language
Buffalo group, Barry Smith, mental, geo, all kinds
CMU Web => KB project | WebKB knowledge base servers |
Swoogle search over 10,000 ontologies |
NeurOK document classification, annotation and search, multilingual.
Andrew McCallum group, "to mine actionable knowledge from unstructured text", includes Bow, Rainbow and Mallet text classification systems. https://imagenotion.com/
Specific domain ontologies:
ACM Classification Scheme | Animal diversity | Mathematics Subject Classification | Physics and Astronomy Classification Scheme | Wiki classification systems | Library of Congress Authorities, search for subject headings.
Topic Maps: Wikipedia topic maps entry | Topic maps wiki page | XML Topic maps | K42 Topic maps |
Recognizing Textual Entailment: ACL Web Wiki info | Pascal competition
Concept Maps, based on constructivist learning ideas: Wikipedia concept map entry | Conceptual structures | Conceptual graphs | Concept mapping | IHMC Concept mapping tools | Wiki concept mapping page | Wiki software | Online Course in Knowledge Representation using Conceptual Graphs | TexFlame, from Pubmed to concept maps.
Other Knowledge Maps: Conceptual graphs (Wikipedia) | Conceptual schemas | Facet maps | Knowledge maps (Wikipedia) | Knowledge visualization (Wikipedia) | Mind mapping (Wikipedia) | Semantic networks (Wikipedia) | Semantic web (Wikipedia) | Visual complexity projects | Web trend map |
Leximancer, automatic analysis of document collections, taxonomy discovery.
Knowledge management: Knowledge Computing Corporation, Coplink system |
CCI, Conversational Computing Inc, legal systems, dialog, NLP news/info
ERNEST is a knowledge representation language, based on semantic networks
One Look (870 dictionaries) | 400 On-line Dictionaries | Word reference | Lingo Z |
Wortschatz Lexicon, Uni. Leipzig.
Webster's Unabridged 1913 | WWWebster Thesaurus | Roget's Thesaurus | Roget's Thesaurus - from Project Gutenberg | The Wordsmyth English Dictionary-Thesaurus | Synonym/antonym Dictionary of English |
Eurodicautom (transl. in 8 languages) |
Onet dictionaries | Free dictionary
Medical and biomedical systems:
National Library of Medicine (NLM) NIH: Semantic Knowledge Representation (SKR), NIH | SKR Bibliography |
Unified Medical Language System | UMLS alternative list,
UMLS bibliography 1986-96 | UMLS 1990-2002 list
MetaMap Transfer (MMTx)
Indexing Initiative Systems
Word Sense Disambiguation (WSD)
Lexical system group, National Library of Medicine
Semantic network, NLM
Medical resources on the web, 2005 update
Medical Reference For Non-Medical Librarians
Medstract, Medline abstract, Brandeis University | Anni, ontology-based interface to Medline |
Cochrane Collaboration, healthcare decision-making through systematic reviews of the effects, Evidence-based Health Care | Cochrane Schizophrenia Group |
BioSemantic Group, Medical Informatics, Leiden University, software and papers | Jane - search for referees |
Health Level 7 XML standard
Health Cyber Map
Knowlet technology, Knewco, Community Annotation and Knowledge Tracking and Discovery.
Language and Computing medical systems
Medical acronyms, over 200K, with 800K definitions!
Medical language processor for automatic text processing and presentation, Sourceforge project.
Morphosaurus, mapping to standardized medical dictionary in any language.
Semantic Mining, biomedical, EU network.
Question answering systems:
Wolfram Alpha |
Language Computer, LCC, top 2004 Q/A system |
MIT Start project |
My[Q]Box, providing question answering software for Web pages.
Watson Natural language understanding demo |
NLP Tools: Taggers, Parsers, NER, NP chunking, Language models ...
NLP Stanford tools links |
English Collocations - ARCS | Oxford Collocations Dictionary | Phrases in English (BNC) |
Probabilistic parsing links |
Link Grammar | Link Grammar Parser |
Living human digital library, services + software for research community in biomechanical modelling, sharing datasets, models etc.
Memory-based tagger |
Multilingual Statistical Parsing Engine, Dan Bikel |
NLP Software Registry (at DFKI) |
MinorThird, Java for text annotation, extraction, categorization (William W. Cohen) |
VisualText, general development environment for NLP |
Knewco, has Concept Web semantically enhanced compilations of knowledge from unstructured text, medical concept net navigator.
Knowledge Media Institute (KMi), part of the Open University, current projects and software tools: AquaLog, a portable question-answering system for organizational ontologies | Compendium, a software tool for mapping information, ideas and arguments | CORDER, COmmunity Relation Discovery by named Entity Recognition | ESpotter, Adaptive Named Entity Recognition for Web Browsing | FLOR, FoLksonomy Ontology enRichment | The Internet Reasoning Service (IRS) Semantic Web Services framework | KCE - Key Concept Extraction, identifying concepts that summarize what the ontology is about | Magpie, ontology-based semantic markup of web documents | OntoWeaver, an ontology-based approach to web development | PhiloSURFical, Semantically browse a philosophical text | PowerAqua, NLP Interface to the Semantic Web | Revyu, to review and rate anything you want | SemSearch Search Engine for the Semantic Web | KMi semantic web automated semantic data integration | Watson, Exploring the Semantic Web | MnM Ontology Driven Semi-Automatic and Automatic Support for Semantic Web | OCML Operational Conceptual Modelling Language | PlanetOnto news server, facilitates lab-related items | WebOnto, browse and edit knowledge models over the web.
20q net - 20 questions game | 20 questions game | 20 questions about animals | Darth Vader reads your mind |
Edict word games |
Creative Language System Group (UC Dublin), word games |
ESP, Verbosity and other games (CMU), great!
Google image labeler
Interesting Search Systems:
Links to various search engines and catalogs
Carrot2 (Dawid Weiss, Poznan)
UIMA, Unstructured Information Management Architecture (IBM), open platform for building unstructured information and knowledge management applications (released 8/2005).
IBM Public Image Monitoring Solution - what do they say about you, your product or your company?
DFAs: Languages and Learning
IR in Chemistry
Edict Word Frequency Text Profiler, Unique Words Text Profiler and concordancer.
Visualization of texts and documents:
Tools for visualization from CAIDA.org | Visual Browsing in Web and non-Web Databases | Citespace - visualizes evolution of a network | Chaomei Chen info visualization
Nice collection of info (Lijexu)
Infomap - info mapping project (Stanford)
Inxight - hyperbolic tree (commercial)
NeurOK semantic explorer
Semantic Atlas, English and French
Touchgraph, link trees via Web interfaces (free), including Internet links java applet
Thinkmap, taxonomy browsers
WEBSOM - free-text mining | WebSOM, PhD Timo Honkela | Astronomical journals and catalogues: removed.
SOM digital library project |
Internet topology visualizations | The Internet Mapping Project (Lumeta) | Nonlinear Magnification InfoCenter |
Course on info visualization, Fall 2005-2006, Tamara Munzner, UBC; includes software and data links.
Machine translation & NLP:
28 systems and 38 languages: Foreignword.com | EAMT (European Association for MT) |
Center for Machine Translation (CMU) |
Interactive Speech Translation
Microsoft MT projects |
Systran Corporation |
WOCADI, based on Multinet, Hartrumpf, Hagen |
MT intro, book by Doug Arnold |
Language Force | Global English (translations, lessons) | Translation Experts (25 languages) |
Machine Translation (Kluwer Journal) | Subjex dialog systems | Systran translation software | MT Introductory Guide (book!)
START natural language system (SynTactic Analysis using Reversible Transformations)
Try it here! Lernout & Hauspie | Alta Vista Babel fish | Free Translation | InterTran (Polish + others) | Reverso, free online translation | Systran | T-mail (European + Russian, Japanese, Korean, Chinese ...) | WorldLingo's free online translator |
Polish-English translation: Translantica.pl (PWN) | Translate.pl | Polish-English dictionary
Polish linguistic resources: Morfologik | Wielki słownik ortograficzny | Computational Linguistics in Poland | Computational Linguistics Research Groups in Poland |
Other topics: APA Style Converter | Safari Books online |
TEMIS, text mining, extraction, semantic solutions |
Vantage learning, web-delivered instructional, assessment and professional development programs.
Last modification 07.07.2016, by Wlodzislaw Duch