
We listen. It's how we provide solutions that have lasting
relevance to a client's business.
Case Study: Pioneer in providing a comprehensive marketplace for multi-channel shopping and advertising
Advertisement Extraction System for converting Print Ads to Online Ads.
Our client converts print ads to online ads for most of the top retailers in the USA.
The ads shown on the retailers' sites are also hosted by our client.
Business Problem - Streamline the cumbersome manual effort required to extract
data and images from PDF files and create hotspots, replacing it with a
semi-automated procedure supported by this Radicle-developed application.
A proof of concept was created in an initial three-week analysis phase. Based on the
understanding gained from that effort, a rules-based extraction process was adopted
in which user-specified rules are used to “understand” the unstructured data in PDF
files, automatically extract and parse it, and save it in separate database fields
for subsequent access by multiple applications. This process also extracts images
and creates hotspots, reducing the workload of the graphics team. To improve system
performance, the client front-end application works with low-resolution PDFs. The
original high-quality images are then extracted from the high-resolution PDF files
on the server back end, using the image information gathered during the extraction
process.
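As an illustration only, the rules-based extraction described above could be sketched as follows. The production system was built in C# on a third-party PDF library; this Python sketch assumes the page text has already been pulled out of the PDF, and the `Rule` class, the field names and the patterns are hypothetical examples, not the application's actual rule format.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Rule:
    """Hypothetical user-defined rule: fills one listing field from raw text."""
    field: str                                # listing field the rule fills
    pattern: str                              # regex with exactly one capture group
    formatter: Callable[[str], Any] = str.strip  # formatting applied to the match


def extract_listing(text, rules):
    """Apply each rule to the text and collect the fields that matched."""
    listing = {}
    for rule in rules:
        match = re.search(rule.pattern, text, flags=re.IGNORECASE)
        if match:
            listing[rule.field] = rule.formatter(match.group(1))
    return listing


# Example rules (assumed for illustration): one keyword rule, two pattern rules.
RULES = [
    Rule("price", r"\$\s*(\d+(?:\.\d{2})?)", float),
    Rule("brand", r"brand:\s*([^\n]+)"),
    Rule("size", r"(\d+(?:\.\d+)?\s*oz)\b", str.lower),
]

sample = "Brand: Acme \nCrunchy Granola 12 OZ box\nSale $3.99 each"
listing = extract_listing(sample, RULES)
print(listing)
```

Each matched value lands in its own named field, which mirrors the idea of saving parsed data in separate database fields; fields whose rule finds no match are simply left out of the result.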
Key system features include:
- Rule Engine – Users can create rules based on text attributes, keywords
and patterns for different listing fields, and specify formatting for the parsed text.
- Text Extraction – Users can apply retailer- or promotion-specific
rules to extract, parse and format text and images for a listing from
PDF files.
- Hotspotting – As part of the listing extraction process, users can choose
to automatically create hotspots for a listing on a PDF page. These hotspots
can be attached to one or more advertisement listings.
- Image Extract – Users can extract listing images from a
low-resolution PDF and tie one of the images to a listing. A back-end batch
process then extracts the high-resolution image from the high-resolution PDF,
based on the image information gathered at the front end. The application also
provides a simple workflow that lets graphics users correct the images.
- System Integration – Integrated with existing systems so that the
application could be in place by the 2006 holiday season without requiring
major changes to, or retraining on, other existing systems.
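To make the Hotspotting and Image Extract features more concrete, here is a minimal sketch of how a region marked on a low-resolution page might be mapped onto the matching high-resolution page. It assumes that the "image information gathered at the front end" amounts to a rectangle in page pixel coordinates and that both renderings share the same aspect ratio; the `Box` and `Hotspot` types and the page sizes are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Rectangle on a page, in pixels, origin at the top-left corner."""
    x: float
    y: float
    width: float
    height: float


@dataclass(frozen=True)
class Hotspot:
    """A clickable region that can be attached to one or more listings."""
    box: Box
    listing_ids: Tuple[int, ...]


def scale_box(box, low_size, high_size):
    """Map a box from low-resolution page pixels to high-resolution page pixels."""
    sx = high_size[0] / low_size[0]   # horizontal scale factor
    sy = high_size[1] / low_size[1]   # vertical scale factor
    return Box(box.x * sx, box.y * sy, box.width * sx, box.height * sy)


# A hotspot drawn on a 612x792 preview page and shared by two listings.
hotspot = Hotspot(Box(x=50, y=100, width=200, height=150), listing_ids=(101, 102))

# The same region on a 2448x3168 rendering of the high-resolution PDF page.
high_res_box = scale_box(hotspot.box, low_size=(612, 792), high_size=(2448, 3168))
print(high_res_box)
```

Under these assumptions, only the small rectangle needs to travel from the front end to the back-end batch process, which can then crop the image from the high-resolution file.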
Technology
employed consists of .NET, C#, a SQL Server database and a third-party PDF extraction library.
Further components of the Ad Extraction process and system are being
developed using Flex, web services and the Cairngorm methodology.