Crawling, qualification and provision of mass data with “deecoob insight”

Do you need large amounts of data and information? They are among the most important resources for your strategic decisions, for tariff, price or product development, for competitive analyses and for increasing your sales and revenue. However, the effort and requirements involved in procuring, analyzing and qualifying mass data are constantly increasing. The software platform deecoob insight is your solution. In multi-level processes, deecoob insight collects and processes the desired data and makes it available to you in the required quantity and quality.

Obtaining data and documents

With deecoob insight, you can obtain nearly any data, information or document from available internal and external sources. It makes no difference whether you are looking for several hundred or several million data records and documents. External sources include websites, online stores, online catalogs, social networks such as Facebook, Instagram, Twitter, LinkedIn and Xing, as well as services such as Google Maps, OpenStreetMap, YouTube and blogs. In addition, more than 1,000 daily, weekly or monthly e-paper editions are available for information searches. Internal sources are your databases, storage systems, archives, application data from ERP, CRM and CMS systems, as well as information from your intranet, wikis and cloud solutions.

As a public rights organization, you need event information such as organizers, places, artists, entrance fees, dates and times of events, and much more. In addition, evidence often has to be generated.

As a production company, you are interested in prices, discounts, bundles, descriptions, announcements, ratings and delivery options for your own products as well as your competitors’ products on specific trading platforms.

As a publisher of online publications, you want to automatically monitor your readers’ comments in order to identify and filter hate speech or other discriminatory comments.

As an operator of an online shop, you want to analyze and evaluate ratings and comments of your online customers.

Cleaning, enriching and indexing data

Let us know which specific data and information you need. Together with you, we determine suitable sources and connect them to the deecoob insight platform interfaces. deecoob insight then procures the data for you. The data is checked during the procurement process, and only relevant data is stored on the platform for further processing. In doing so, duplicates are identified and removed. Data that is not available in one source is gathered from other sources where possible. In this way, deecoob insight produces verified, cleaned and enriched data records for you. All data is stored in a high-performance environment and completely indexed for later processing.
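The duplicate removal and enrichment described above can be illustrated with a minimal sketch. It is not the deecoob insight implementation: the record fields (`name`, `date`, `venue`), the choice of duplicate key and the function names are assumptions made purely for illustration.

```python
from typing import Dict, Iterable, List, Optional

Record = Dict[str, Optional[str]]


def normalize_key(record: Record) -> tuple:
    """Build a duplicate-detection key from name and date (an assumed choice)."""
    name = (record.get("name") or "").strip().lower()
    date = (record.get("date") or "").strip()
    return (name, date)


def merge_records(records: Iterable[Record]) -> List[Record]:
    """Collapse duplicates and fill missing fields from later sources."""
    merged: Dict[tuple, Record] = {}
    for record in records:
        key = normalize_key(record)
        if key not in merged:
            merged[key] = dict(record)
            continue
        # Duplicate found: keep the stored record and fill only the fields it lacks.
        target = merged[key]
        for field, value in record.items():
            if not target.get(field) and value:
                target[field] = value
    return list(merged.values())


# Hypothetical records crawled from two different sources.
crawled = [
    {"name": "Jazz Night", "date": "2024-05-01", "venue": None},
    {"name": "jazz night ", "date": "2024-05-01", "venue": "Club A"},
    {"name": "Rock Fest", "date": "2024-06-10", "venue": "Arena B"},
]
cleaned = merge_records(crawled)
```

Here the two "Jazz Night" entries collapse into one record, and the venue missing from the first source is taken from the second.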

Analyzing and classifying data

Analyzing unstructured text is technically and linguistically very demanding. deecoob insight provides the necessary algorithms, procedures and methods for efficient text mining. The purpose is to automatically filter, from millions of records, exactly those texts, data and details that are relevant to your requirements. The software also identifies and classifies gradations of relevance (e.g. very relevant, moderately relevant, not relevant) for you.

deecoob insight uses numerous modern “information retrieval methods” for the best possible classification of data from online and offline sources. These methods are:

• Text geo tagging – Which places are included in the text?
• Text time tagging – Which date or time period does the message refer to?
• Text entity tagging – Which persons or teams are mentioned?
• Latent semantic indexing – Which text best matches the query?
• Advanced text classification – Which texts are relevant to the topic?
• Natural language processing – What is expressed by the vocabulary?
• Named entity recognition – Which names or proper nouns are contained in the text?
• Linked data – Are there connections between the texts?
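To give a rough idea of what two of the methods listed above do, the following sketch tags places and dates in a text. It is a deliberately simple stand-in, not the platform's method: the place list `KNOWN_PLACES`, the date pattern and the function `tag_text` are hypothetical, and a production system would rely on much larger gazetteers and trained language models.

```python
import re
from typing import Dict, List

# Hypothetical gazetteer; a real system would use a far larger place index.
KNOWN_PLACES = ["Dresden", "Berlin", "Leipzig"]

# Matches ISO dates (2024-05-01) and day-first dotted dates (01.05.2024).
DATE_PATTERN = re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}\.\d{1,2}\.\d{4})\b")


def tag_text(text: str) -> Dict[str, List[str]]:
    """Return the known places and date strings found in the text."""
    places = [
        place for place in KNOWN_PLACES
        if re.search(rf"\b{re.escape(place)}\b", text)
    ]
    dates = DATE_PATTERN.findall(text)
    return {"places": places, "dates": dates}


tags = tag_text(
    "Live concert in Dresden on 01.05.2024, repeated in Leipzig on 2024-05-03."
)
```

Places are reported in gazetteer order and dates in the order they appear in the text.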

Completing, comparing and filtering data

Our customers receive the best results in terms of quantity and quality from deecoob insight. To achieve this, we use several cascading and iterative data and text mining methods on the platform. The data is repeatedly compared, filtered and evaluated. Thanks to this multi-level processing, deecoob insight can reduce the number of data records for the customer from several million initially to a few thousand or a few hundred highly relevant records. These can then be scored and offered as primary, secondary and tertiary references. In the subsequent manual qualification, employees can concentrate on the primary references but also derive additional relevant information from the secondary and tertiary references.
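The idea of cascading filters followed by scoring into reference tiers can be sketched as follows. This is a toy illustration under stated assumptions: the keyword list, the score (fraction of keywords found), the thresholds 0.75 and 0.5, and all function names are invented for the example and do not describe the actual scoring used by deecoob insight.

```python
from typing import Callable, Iterable, List

KEYWORDS = ["concert", "live", "band", "tickets"]  # hypothetical topic terms


def relevance_score(text: str) -> float:
    """Fraction of topic keywords that occur in the text (a toy score)."""
    words = set(text.lower().split())
    return sum(1 for keyword in KEYWORDS if keyword in words) / len(KEYWORDS)


def cascade(texts: Iterable[str], stages: List[Callable[[str], bool]]) -> List[str]:
    """Apply each filter stage in order; a text must pass all of them."""
    remaining = list(texts)
    for stage in stages:
        remaining = [text for text in remaining if stage(text)]
    return remaining


def reference_tier(score: float) -> str:
    """Map a score to a reference tier using assumed thresholds."""
    if score >= 0.75:
        return "primary"
    if score >= 0.5:
        return "secondary"
    return "tertiary"


texts = [
    "live concert tonight band plays tickets available",
    "band announces live show",
    "concert review",
    "weather forecast for the weekend",
    "",
]
stages = [
    lambda text: bool(text.strip()),           # stage 1: drop empty records
    lambda text: relevance_score(text) > 0.0,  # stage 2: drop off-topic records
]
ranked = [(text, reference_tier(relevance_score(text))) for text in cascade(texts, stages)]
```

Of the five input texts, three survive both stages and are assigned one tier each, so that manual qualification can start with the primary references.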

Data visualization and presentation for manual processing

deecoob insight cannot and should not replace manual examination and qualification by deecoob’s research professionals. Only through the intelligent combination of software-based automation and manual processing can we achieve the data quality that fully convinces our customers. With the deecoob insight client, our employees have an ergonomic user interface for the manual processing of prequalified data. Customers who carry out the manual qualification themselves also use this client.

For continuous monitoring of data quality and quantity, deecoob insight provides individual dashboards. These visualize the platform’s KPIs (key performance indicators). On this basis, adjustments and optimizations can be made continuously during the ongoing crawling and qualification process.

Checking and qualifying data manually

deecoob’s research professionals examine each data record classified as “relevant” by the software and check it for completeness and correctness according to the individual customer’s priorities. Depending on the requirements, they can combine data records, reclassify them, research and add missing data, make assignments or reassessments, and carry out many other tasks. Upon completion of processing, the records and related documents (e.g. evidence) are given the corresponding status.

Producing results – modeling data

The results of the work carried out by deecoob insight and deecoob’s research professionals can differ greatly depending on customer requirements. For the deecoob solutions “Copyright Observation”, “Content Exploration” or “Data Crawling”, often “only” the qualified and completed data records and documents are handed over to the customer.

However, complex data models (e.g. benchmarks) must be created in advance for topics such as “Market Monitoring”, “Reputation Reporting” or “Data Analytics”. The deecoob specialists develop appropriate data models together with the client. With deecoob insight and deecoob service, the models are then completed with the required data.

Data provision

Results from deecoob insight and deecoob service are provided automatically, semi-automatically or manually, depending on the customer’s IT systems and requirements. If there is a connection for automatic data exchange between deecoob insight and the customer system, the data and documents can be transferred automatically via a data interface. If this is not possible or not required, deecoob service may need to enter the data into the customer systems manually or transfer it as a CSV file. Data models (e.g. benchmarks) are provided either once as a presentation or in real time on dashboards.
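As an illustration of the CSV transfer mentioned above, the following sketch serializes qualified records to CSV text with Python's standard `csv` module. The column names in `FIELDS`, the sample record and the function `records_to_csv` are assumptions for this example; the actual export format is agreed with each customer.

```python
import csv
import io
from typing import Dict, List

FIELDS = ["id", "title", "status"]  # hypothetical export columns


def records_to_csv(records: List[Dict[str, str]]) -> str:
    """Serialize records to CSV text, ignoring fields outside FIELDS."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=FIELDS,
        extrasaction="ignore",  # skip internal fields not meant for export
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


csv_text = records_to_csv([
    {"id": "1", "title": "Jazz Night, Club A", "status": "qualified", "internal_note": "x"},
])
```

Values that contain a comma are quoted automatically by the `csv` module, so the exported file remains parseable on the customer side.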

Test deecoob insight now.