Reverse Search with Elasticsearch

I am working on a Rails application that allows construction companies to manage a portfolio of projects (AscribeHQ.com). It also lets other users create a “Group Portfolio” that displays projects uploaded by other users, based on potentially complicated criteria. (For example, the ‘City of Chicago Group Portfolio’ wants to show construction projects that have budgets over $250k and are located within the city limits.)

The problem

A Group Portfolio searches the existing projects to show the user which projects they can display, which we accomplish with Elasticsearch. The real challenge is that we want the company user who uploads a new project to find all the Group Portfolios that match their new project. Essentially, we need a “Reverse Search”.

The solution

Elasticsearch has an amazing feature called Percolation, which allows us to save the complicated searches of each Group Portfolio into an index. Then, when a new project is added, we can ‘search the searches’ to get back the IDs of the Group Portfolios whose searches match it. Even better, there is a Ruby gem called Tire that supports this percolation feature.
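To make the ‘search the searches’ idea concrete, here is a toy, in-memory sketch in plain Ruby. It is not Elasticsearch or Tire: the ToyPercolator class, its method names, and the sample data are all hypothetical, and real percolation supports far richer queries than this exact-match check. It only shows the direction of the lookup: searches are stored first, and a document is matched against them later.

```ruby
# Toy illustration of percolation. ToyPercolator is a hypothetical class,
# not part of Elasticsearch or Tire.
class ToyPercolator
  def initialize
    @queries = {} # query id => { field => [accepted values] }
  end

  # Store a search under an id (for example, a Group Portfolio id).
  def register(id, criteria)
    @queries[id] = criteria
  end

  # Return the ids of every stored search that the document satisfies.
  def percolate(document)
    @queries.select { |_id, criteria|
      criteria.all? { |field, accepted| accepted.include?(document[field]) }
    }.keys
  end
end

percolator = ToyPercolator.new
percolator.register(1, 'state' => ['mi'])
percolator.register(2, 'state' => ['il'], 'city' => ['chicago'])
percolator.register(3, 'state' => ['mi', 'il'])

# A newly uploaded project is checked against all saved searches at once.
new_project = { 'title' => 'Library renovation', 'state' => 'il', 'city' => 'chicago' }
matching_ids = percolator.percolate(new_project)
```

With the sample data above, the new project satisfies searches 2 and 3 but not search 1, which only accepts projects in 'mi'.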

I won’t go into the setup of Elasticsearch itself, as there are many great blog posts on that part already, but this is how we set up percolation using the Tire gem:

UserProject model:

class UserProject < ActiveRecord::Base
  include Tire::Model::Search
  include Tire::Model::Callbacks

  # Magic method that returns all the saved searches (Group Portfolios) that match this project
  def find_matching_groups
    Portfolio::Group.find(UserProject.index.percolate(self))
  end

  # The normal mapping that is required to set up ElasticSearch
  mapping do
    indexes :title,            :as => 'title', :boost => 2
    indexes :custom_title,     :as => 'custom_title', :boost => 2
    indexes :owner,            :as => 'owner.company_name', :boost => 2
    indexes :subtitle,         :as => 'subtitle'
    indexes :street,           :as => 'street'
    indexes :city,             :as => 'city'
    indexes :state,            :as => 'state'
    indexes :zip,              :as => 'zip'
  end

end

Group Portfolio model:

class Portfolio::Group < Portfolio
  include Tire::Model::Search
  include Tire::Model::Callbacks

  after_save do |group_portfolio|
    if group_portfolio.project_criteria.present?
      group_portfolio.save_query
    end
  end

  # Adds or updates this portfolio's percolator query in Elasticsearch
  def save_query
    UserProject.index.register_percolator_query(self.id) do |q|
      params = {}

      # project_criteria is saved on the Group Portfolio object.  ex: [{"filter_type": "state", "states": ["MI"]}, {"filter_type": "proj_type", "types": ["28", "29"]}]
      self.project_criteria.from_json.each do |criteria|
        params = params.merge(criteria)
      end

      q.filtered do
        query do
          boolean do
            must { terms :phase_id, params['phases']} if params['phases']
            must { terms :project_type_id, params['types']} if params['types']
            must { terms :green_id, params['greens']} if params['greens']
            must { terms :delivery_method_id, params['delivery_methods']} if params['delivery_methods']
            must { terms :project_definition_id, params['project_definitions']} if params['project_definitions']
            must { terms :state, params['states'].map(&:downcase) } if params['states']
            must { terms :city, params['cities']} if params['cities']
            must { terms :zip, params['zips']} if params['zips']
          end
        end
      end
    end
  end

end
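The merge loop in save_query can be tried on its own. The sketch below assumes project_criteria is stored as a JSON string shaped like the example in the comment, and it uses JSON.parse from Ruby's standard library as a stand-in for the from_json call in the model; both of those are assumptions, not details confirmed by the code above.

```ruby
require 'json'

# Example criteria, copied from the comment in save_query.
project_criteria = '[{"filter_type": "state", "states": ["MI"]}, ' \
                   '{"filter_type": "proj_type", "types": ["28", "29"]}]'

# Fold the list of criteria hashes into a single params hash,
# which is what the each/merge loop in save_query does.
params = JSON.parse(project_criteria).reduce({}) do |merged, criteria|
  merged.merge(criteria)
end

# "filter_type" is overwritten on every merge and ends up holding the last
# value. That is harmless here because only the plural keys are read.
states = params['states'].map(&:downcase) if params['states']
```

After the merge, params holds one entry per filter ('states' and 'types' here), so each `must` clause can simply check whether its key is present. The downcase on states is presumably there because the indexed state values are lowercased when analyzed, while a terms query compares its values as given.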

Action returning Group Portfolios:

class Manage::PublishingsController < ApplicationController

  def index
    @project = Project.find(params[:project_id])
    @user_project = UserProject.where(:portfolio_id => current_portfolio.id, :project_id => @project.id).first
    @group_portfolios = Portfolio::Group.where(:id => @user_project.find_matching_groups)
  end

end

Results

We were previously trying to accomplish this same type of result with Delayed Job and some very complicated code. It often took around 5 minutes to do this ‘reverse search’. Now the user sees the results in about half a second, and the code is simpler. This is a big win for us and will help us offer better service to our customers.

Applications

There seem to be endless applications for this. On dating sites, a new user can be told how many people (and even who) searched for someone like them before they signed up. Auto dealerships could easily see if a car they could buy matches any recent searches for vehicles on their site. An advertising site could estimate how many views an ad will get based on previous queries. All of these things could be done in other ways, but the code would likely be very complicated and slow.

references: Percolation

jason@collectiveidea.com
