HTML Stripper New Features
Like WebToData, the latest version of HTML Stripper has Data Sensing technology. This clever technology adds artificial intelligence abilities to the program.
HTML Stripper can now analyse the contents of a web page and sense the data in the page. The data is then structured in a comma-delimited format, ready to use in your favourite database. The program does all of this automatically!
To gain the benefits of this new technology all you have to do is order a registered copy of HTML Stripper.
HTML Stripper can process multiple files in one batch. It can either create a corresponding output file for each file processed, or create a single file containing the results of all selected files.
For database creators this means HTML Stripper can process many HTML files and create one database file. This feature is only available in the registered version.
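The two batch modes described above can be illustrated with a minimal Python sketch. It is not HTML Stripper's own code: the names `strip_html` and `process_batch`, the `.txt` output extension and the `combined_name` parameter are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects the text found between tags and discards the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def strip_html(html):
    """Return the text of a page, one line per text fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def process_batch(paths, out_dir, combined_name=None):
    """Strip every file in `paths` (hypothetical helper, not the product's API).

    With combined_name=None, one output file is written per input file.
    Otherwise all results are written to a single file of that name.
    Returns the list of files written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = [(Path(p), strip_html(Path(p).read_text(encoding="utf-8")))
               for p in paths]

    if combined_name is not None:
        target = out_dir / combined_name
        combined = "\n".join(text for _, text in results)
        target.write_text(combined + "\n", encoding="utf-8")
        return [target]

    written = []
    for src, text in results:
        target = out_dir / (src.stem + ".txt")
        target.write_text(text + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Calling `process_batch(files, "out")` gives one result file per page, while `process_batch(files, "out", combined_name="all.txt")` gathers everything into one file, which is the mode a database creator would most likely want.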
Data Sensing is a clever new technology developed by GLTSoft.
The design of most web pages is based on HTML tables. Tables are used to structure the layout of pages and to present information to web users. This creates a difficulty for any program that attempts to extract data from web pages - how to find worthwhile data? Our research has found that expensive programs that claim to have the ability to work with data found in web pages are not successful.
GLTSoft have developed an Artificial Intelligence engine that is integrated into HTML Stripper and WebToData. This engine analyses the data found in web pages, extracts it and then presents the data ready to be saved to file in a database format.
For example the following table would be found by HTML Stripper and analysed by the Data Sensing engine.
Once analysed, the data would be presented in the following format ready to save.
Name, Street, Town
Bob Smith, 12 Wood St, Miles
Jan Albert, 44 Station Rd, Oakton
Dianne Wong, 33 Range Rd, Clifton
Jack Dinnis, 19 Cutting Rd, Temby
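For readers curious how a simple table can end up as comma-delimited text, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the Data Sensing engine itself: the class `TableRows` and the function `table_to_csv` are hypothetical names, and the sketch handles a single, non-nested table.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the cell text of one simple (non-nested) HTML table."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_csv(html):
    """Return the table in `html` as comma-delimited text (hypothetical helper)."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()
```

The sketch uses Python's `csv` module so that a cell containing a comma is quoted correctly. As a result its output has no space after each comma, unlike the listing above.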
This innovation is called Data Sensing technology. It does far more than extract data from simple tables such as the one presented above. The technology is designed to work with highly complex web pages, which typically contain multiple tables. Many of these tables are used only to structure the layout of the page and therefore do not contain useful data. Data Sensing identifies the data in the page and extracts the information most likely to import directly into a database.
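How the engine tells a layout table from a data table is not published. The Python sketch below shows one plausible heuristic and should be read as an assumption throughout: a table is treated as data only if it contains no nested table, has at least two rows, and every row has the same number of cells. The names `TableCollector`, `looks_like_data` and `sense_data` are hypothetical.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collects every table on a page, including nested ones."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables, innermost first
        self._stack = []   # tables currently open, innermost last

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self._stack:
                # The enclosing table wraps another table: likely page layout.
                self._stack[-1]["nested"] = True
            self._stack.append({"rows": [], "nested": False, "cell": None})
        elif self._stack:
            table = self._stack[-1]
            if tag == "tr":
                table["rows"].append([])
            elif tag in ("td", "th") and table["rows"]:
                table["cell"] = []

    def handle_data(self, data):
        if self._stack and self._stack[-1]["cell"] is not None:
            self._stack[-1]["cell"].append(data)

    def handle_endtag(self, tag):
        if not self._stack:
            return
        table = self._stack[-1]
        if tag in ("td", "th") and table["cell"] is not None:
            text = " ".join("".join(table["cell"]).split())
            table["rows"][-1].append(text)
            table["cell"] = None
        elif tag == "table":
            self.tables.append(self._stack.pop())


def looks_like_data(table, min_rows=2, min_cols=2):
    """Assumed heuristic: regular, non-nested grids are data, the rest is layout."""
    rows = [row for row in table["rows"] if row]
    if table["nested"] or len(rows) < min_rows:
        return False
    widths = {len(row) for row in rows}
    if len(widths) != 1 or min(widths) < min_cols:
        return False
    # Every row must hold at least one non-empty cell.
    return all(any(row) for row in rows)


def sense_data(html):
    """Return the rows of every table that passes the heuristic."""
    collector = TableCollector()
    collector.feed(html)
    collector.close()
    return [[row for row in t["rows"] if row]
            for t in collector.tables if looks_like_data(t)]
```

On a page where an outer table holds a menu in one cell and a table of records in another, `sense_data` would return only the inner table. A real engine would need further signals, since regular grids are sometimes used for layout too.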
Naturally, HTML Stripper also provides a full copy of all data in another window so that the operator of the program can make a full comparison.
Web pages are designed by humans for humans. Even if the data is coming from a web server, the formatting of the data is controlled by the designer of the web page. This means that the formatting of data does not always follow logically defined rules.
To extract data from these sorts of web pages, we have now added 'User Controlled Data Definitions' to HTML Stripper Pro.
This feature lets you customise how HTML Stripper searches for data in web pages.
Data Pattern Matching is the next step towards giving users of HTML Stripper ultimate control over extracting data from web pages. Web page designers are using more creative licence when they create pages based on databases. It is not uncommon to see data that should essentially be one record spread over a number of lines on a web page. If you have data formatted in this way, Data Pattern Matching can find it and rebuild it as a comma-delimited database. It is a powerful feature that makes HTML Stripper one of the most useful software programs available.
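The idea of rebuilding a record that is spread over several lines can be sketched in Python with a multi-line regular expression. This is an illustration only, not the way HTML Stripper defines its patterns: the function `rebuild_records`, the labelled line layout (`Name:`, `Street:`, `Town:`) and the pattern itself are assumptions.

```python
import csv
import io
import re


def rebuild_records(text, pattern):
    """Turn each multi-line match of `pattern` into one comma-delimited row.

    `pattern` must use named groups; the group names become the header row.
    """
    regex = re.compile(pattern, re.MULTILINE)
    # Order the field names by their position in the pattern.
    fields = sorted(regex.groupindex, key=regex.groupindex.get)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)
    for match in regex.finditer(text):
        writer.writerow([match.group(name) for name in fields])
    return buffer.getvalue()


# Hypothetical pattern: one record occupies three consecutive labelled lines.
RECORD_PATTERN = (
    r"^Name: (?P<name>.+)\n"
    r"^Street: (?P<street>.+)\n"
    r"^Town: (?P<town>.+)$"
)
```

Given text in which each record occupies three labelled lines, `rebuild_records(text, RECORD_PATTERN)` produces a header row followed by one line per record. Lines that do not fit the pattern are simply skipped.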
Need extra database creation power? WebToData may have the features you need. WebToData uses the features of HTML Stripper as the foundation for extended and more powerful database creation tools. Check out WebToData for further details.