Guided Search
Principle
The primary intent behind Guided Search is to steer the user toward results that match, or come close to, what they intended to find. It narrows the possibilities step by step, cutting out noise and distractions and thereby reducing frustration. Guided Search can also be extended to track a user's search history and to offer the most popular search terms as auto-suggestions.
Methodology
Guided Search classifies every course in an enterprise by its Title and/or Short description. Classification is performed with IBM Watson's Natural Language Understanding (NLU) suite, which analyses the supplied text and its metadata and extracts concepts, entities and related categories through a simple API. The categories are returned as a multi-level hierarchy and can be tuned and processed further.
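The classification step can be sketched as follows. This is a minimal illustration, not the project's actual code: it only builds the JSON body for an NLU analyze request (text plus the categories, concepts and entities features) and splits a hierarchical category label into its levels. The function names, the limit value and the sample label are assumptions; authentication and the HTTP call itself are left out.

```python
import json


def build_nlu_request(title, short_description="", limit=5):
    """Build a JSON body for a Watson NLU analyze request (sketch).

    The course Title and Short description are joined into one text.
    The limit of 5 results per feature is an assumed value.
    """
    parts = (title, short_description)
    text = " ".join(p.strip() for p in parts if p and p.strip())
    return {
        "text": text,
        "features": {
            "categories": {"limit": limit},
            "concepts": {"limit": limit},
            "entities": {"limit": limit},
        },
    }


def split_category_label(label):
    """Split a hierarchical label such as '/a/b/c' into ['a', 'b', 'c']."""
    return [level for level in label.strip().split("/") if level]


if __name__ == "__main__":
    # Hypothetical course record, for illustration only.
    body = build_nlu_request("ABB Motors", "Maintenance basics")
    print(json.dumps(body, indent=2))
    # Illustrative label; real labels come from the NLU response.
    print(split_category_label("/business and industrial/energy"))
```

Splitting the label into levels makes it possible to present only the deepest level as a subcategory while keeping the parent levels for further tuning.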
Categories and entities are then fed into Google's Keyword Planner to extract related keywords, that is, the most frequently searched keywords related to each category. This maximizes the chance of a search hit for a course even when the search term does not appear in its title or short description.
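As a rough sketch of how categories, entities and their related keywords might be merged into the tokens for one course: the function below is hypothetical, and the related_keywords dictionary is only a stand-in for Keyword Planner results, which are not fetched here.

```python
def build_tokens(categories, entities, related_keywords):
    """Merge categories, entities and related keywords into one token list.

    related_keywords maps a seed term to the keywords returned for it.
    It is a plain dict standing in for Keyword Planner output (assumption).
    Tokens are lower-cased and de-duplicated, keeping first-seen order.
    """
    seen = set()
    tokens = []
    for seed in list(categories) + list(entities):
        for term in [seed] + list(related_keywords.get(seed, [])):
            token = term.strip().lower()
            if token and token not in seen:
                seen.add(token)
                tokens.append(token)
    return tokens


if __name__ == "__main__":
    # Illustrative values only; not real Keyword Planner data.
    print(build_tokens(
        ["Energy"],
        ["ABB"],
        {"Energy": ["power plant"], "ABB": ["abb motors"]},
    ))
```

Because the related keywords are stored next to the category, a search for a related keyword can reach the course even though the word never occurs in the course text.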
Workflow
The Guided Search workflow diagram is available at the link below.
https://drive.google.com/file/d/1ph11m0tPFEv5HDb1tcGu1SyktJVsKpER/view?usp=sharing
Technical Specifications
Pentaho ETL - Performs data extraction, text clean-up, generation of categories using IBM Watson NLU, and generation of related keywords through Google Keyword Planner
Cassandra Big Data DB - Stores the generated categories and keywords as tokens
SOLR Collection - Indexes the generated tokens and their corresponding Course IDs
Python Search Program - Searches the SOLR index in multiple steps: it first displays the categories that match the search term, then the courses for the selected category
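The multi-step search could look roughly like the sketch below, which only builds the two Solr select URLs and parses a facet result; it does not contact a server. The collection URL and the field names (tokens, category, course_id) are assumptions, as is the choice to match the search term as a phrase. Step 1 facets on the category field to list subcategories; step 2 filters by the chosen category and returns Course IDs.

```python
from urllib.parse import urlencode

# Hypothetical collection URL and field names; adjust to the real schema.
SOLR_SELECT = "http://localhost:8983/solr/guided_search/select"
TOKEN_FIELD = "tokens"
CATEGORY_FIELD = "category"
COURSE_ID_FIELD = "course_id"


def _phrase(value):
    """Quote a value as a Solr phrase, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def category_query_url(search_term):
    """Step 1: list the categories of documents matching the search term."""
    params = {
        "q": f"{TOKEN_FIELD}:{_phrase(search_term)}",
        "rows": 0,  # only the facet counts are needed
        "facet": "true",
        "facet.field": CATEGORY_FIELD,
        "facet.mincount": 1,
        "wt": "json",
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


def course_query_url(search_term, category):
    """Step 2: fetch Course IDs for the term within the chosen category."""
    params = {
        "q": f"{TOKEN_FIELD}:{_phrase(search_term)}",
        "fq": f"{CATEGORY_FIELD}:{_phrase(category)}",
        "fl": COURSE_ID_FIELD,
        "wt": "json",
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


def parse_facets(flat):
    """Turn Solr's flat [value, count, value, count, ...] list into pairs."""
    return list(zip(flat[0::2], flat[1::2]))


if __name__ == "__main__":
    print(category_query_url("abb motors"))
    print(course_query_url("abb motors", "business operations"))
```

Using a filter query (fq) for the category keeps the original search term as the main query, so the second step narrows the first result set instead of starting a new search.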
Sample Input/Output
1. Enter Search Term : abb motors
Subcategory
Business Operations
Hardware
Manufacturing
Software
Metals
Computer certification
Cars
Enter Subcategory : business operations
Courses that match your chosen category are:
6166
6167
6168
2. Enter Search Term : service engineer
Subcategory
Hardware
Business Operations
Certified pre-owned
energy
Enter Subcategory: energy
Courses that match your chosen category are:
18616 1