Posted By: Arnaud Saval
Date: 2012-04-19 14:08
Summary: First WebLab Bundle Released

This bundle crawls a local folder (toIndex), analyzes the text-based documents it finds, indexes them, and finally makes them accessible through a portal.
The processing capabilities are limited (only the default rules of the named-entity extraction engine are used), but the bundle provides a complete processing chain and eases the integration and testing of new components, either in the processing chain or in the user interface.
This bundle is released regularly (http://weblab-project.org/index.php?title=Download) and built nightly with the latest services and portlets (see http://bamboo.ow2.org/browse/WEBLAB-BUNDLE).

This bundle presents an information retrieval system based on the complete WebLab architecture.
It is mainly composed of the following WebLab services:
- a homemade folder crawler able to listen to a given folder and crawl its content (http://weblab-project.org/index.php?title=Folder_Listener),
- a normaliser, based on Apache Tika, that extracts the text content of various file types (MS Office, PDF, RTF, etc.) (http://weblab-project.org/index.php?title=Normaliser_using_Tika),
- a named-entity extraction service, based on a gazetteer, that detects known words in the documents and annotates them (http://weblab-project.org/index.php?title=Simple_Gazetteer),
- an indexer, based on Apache Solr, that indexes the text content and makes it searchable (http://weblab-project.org/index.php?title=Solr_Indexer/Searcher_WebLab_Web_Service).
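
To give an idea of what gazetteer-based extraction means in practice, here is a minimal sketch in Python. It is not the code of the Simple Gazetteer service: the `GAZETTEER` entries, the entity types and the `annotate` function are all hypothetical names chosen for illustration. The sketch only shows the general technique of matching a fixed word list against the text and recording where each match occurs.

```python
import re

# Hypothetical gazetteer: surface form -> entity type. These entries are
# illustrative only and are not taken from the bundle's default rules.
GAZETTEER = {
    "Paris": "Place",
    "OW2": "Organisation",
    "Apache Tika": "Software",
}


def annotate(text, gazetteer=GAZETTEER):
    """Return (start, end, surface form, type) for every gazetteer match."""
    if not gazetteer:
        return []
    # Try longer entries first so a multi-word entry wins over a shorter one.
    entries = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(entry) for entry in entries) + r")\b"
    )
    return [
        (m.start(), m.end(), m.group(0), gazetteer[m.group(0)])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    for start, end, form, kind in annotate("OW2 hosts a meeting in Paris."):
        print(start, end, form, kind)
```

Keeping the character offsets with each match is what would allow a viewer to highlight the entities in the original text afterwards.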

In addition to these services, the bundle includes some technical services.

The demo also contains a WebLab chain that links the previously mentioned services, and four WebLab portlets:
- a launchCrawl portlet that launches and monitors the processing of documents by the chain,
- a search portlet that sends queries to the Solr searcher,
- a result portlet that displays the results of the query,
- an annotated document portlet that displays the document with the annotations added by the named-entity extraction service.
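
The idea of a chain that passes each crawled document from one service to the next can be pictured with the following Python sketch. It is only an illustration under strong assumptions: the real WebLab chain calls web services, whereas here `crawl`, `normalise`, `index` and `run_chain` are hypothetical local functions, plain-text reading stands in for the Tika normaliser, and a small in-memory dictionary stands in for the Solr index. The named-entity step is left out for brevity.

```python
from pathlib import Path


def crawl(folder):
    """Yield one document record per regular file found under the folder."""
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            yield {"source": str(path)}


def normalise(doc):
    # Stand-in for the Tika-based normaliser: this sketch reads plain text only.
    doc["text"] = Path(doc["source"]).read_text(encoding="utf-8", errors="replace")
    return doc


def index(doc, store):
    # Stand-in for the Solr indexer: a tiny in-memory inverted index
    # mapping each lowercased token to the set of files containing it.
    for token in set(doc["text"].lower().split()):
        store.setdefault(token, set()).add(doc["source"])
    return doc


def run_chain(folder, store):
    """Push every crawled document through each stage, in order."""
    stages = [normalise, lambda d: index(d, store)]
    processed = []
    for doc in crawl(folder):
        for stage in stages:
            doc = stage(doc)
        processed.append(doc["source"])
    return processed
```

Calling `run_chain("toIndex", {})` would process every file under the `toIndex` folder; a query in this sketch is then just a dictionary lookup such as `store.get("weblab", set())`.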


