Product Details

FITLayout Web Page Segmentation Framework

Created: 2014

Czech title
FITLayout rámec pro segmentaci webových stránek
Type
software
License
Use of the result by another entity always requires acquiring a license
License Fee
The licensor does not require a license fee for the result
Authors
Burget Radek, doc. Ing., Ph.D. (DIFS)
Milička Martin, Ing.
Keywords

web page segmentation, document analysis, text classification, web page rendering

Description

FITLayout is an extensible web page segmentation framework written in Java. It defines a generic Java API for representing a rendered web page and its division into visual areas, and it provides a base for implementing page segmentation algorithms behind a common application interface. As a sample segmentation method, it implements a previously published algorithm based on recursive visual area merging and separator detection. The framework includes tools for post-processing the segmentation result with various text or visual classification methods. It also provides a graphical user interface for controlling the segmentation process and examining its results. The segmentation result may be stored as RDF data for later analysis.
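
To make the idea of merging visual areas more concrete, the following is a minimal sketch in Java. It is not FITLayout's API and not the published algorithm: the class AreaMergeSketch, the Area type, the gapTo and merge methods, and the maxGap threshold are all hypothetical names chosen for illustration. The sketch assumes that a whitespace gap wider than maxGap acts as a separator, and it repeatedly merges any two areas whose gap is within the threshold, keeping the merged areas as children of the new parent.

```java
import java.util.ArrayList;
import java.util.List;

public class AreaMergeSketch {

    /** A rectangular visual area. Hypothetical type, not FITLayout's API. */
    public static final class Area {
        public final int x1, y1, x2, y2;
        public final List<Area> children = new ArrayList<>();

        public Area(int x1, int y1, int x2, int y2) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        }

        /** Smallest rectangle covering both areas; both become its children. */
        public static Area union(Area a, Area b) {
            Area parent = new Area(Math.min(a.x1, b.x1), Math.min(a.y1, b.y1),
                                   Math.max(a.x2, b.x2), Math.max(a.y2, b.y2));
            parent.children.add(a);
            parent.children.add(b);
            return parent;
        }

        /** Whitespace gap to another area; 0 if the rectangles touch or overlap. */
        public int gapTo(Area o) {
            int dx = Math.max(0, Math.max(o.x1 - x2, x1 - o.x2));
            int dy = Math.max(0, Math.max(o.y1 - y2, y1 - o.y2));
            return Math.max(dx, dy);
        }
    }

    /**
     * Greedy merging (an assumed simplification): while any two areas are
     * separated by at most maxGap, replace them with their union.
     * Gaps wider than maxGap are treated as separators and are never bridged.
     */
    public static List<Area> merge(List<Area> input, int maxGap) {
        List<Area> areas = new ArrayList<>(input);
        boolean merged = true;
        while (merged) {
            merged = false;
            outer:
            for (int i = 0; i < areas.size(); i++) {
                for (int j = i + 1; j < areas.size(); j++) {
                    if (areas.get(i).gapTo(areas.get(j)) <= maxGap) {
                        Area parent = Area.union(areas.get(i), areas.get(j));
                        areas.remove(j);      // j > i, so index i stays valid
                        areas.set(i, parent);
                        merged = true;
                        break outer;          // restart the scan after a merge
                    }
                }
            }
        }
        return areas;
    }

    public static void main(String[] args) {
        List<Area> leaves = List.of(
            new Area(0, 0, 100, 20),     // heading line
            new Area(0, 25, 100, 45),    // paragraph line, 5 px below the heading
            new Area(0, 200, 100, 220)   // footer, far below
        );
        List<Area> result = merge(leaves, 10);
        System.out.println(result.size() + " top-level areas");
    }
}
```

With the sample input in main, the heading and the paragraph line are merged into one area and the footer stays separate, so the program prints "2 top-level areas". A real segmentation method would also weigh visual cues such as font, color, and alignment, which this sketch deliberately leaves out.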

Location
License Conditions

Free software under the terms of the GNU GPL.

Projects
The IT4Innovations Centre of Excellence, MŠMT, Operational Programme Research and Development for Innovations, ED1.1.00/02.0070, 2011-2015, completed
Research of advanced ICT methods and their applications, BUT, BUT internal projects, FIT-S-14-2299, 2014-2016, completed
Research groups
Departments