Is this a good way to implement a search feature in my Rails application that uses DBpedia and SPARQL? Is there a better way to do this?
I'm trying to set up a "movie search" application using Ruby on Rails 3. I'm pulling data from DBpedia using SPARQL (the rdf and sparql/client gems). I want a potential user to be able to search for a movie, view the results, and click through to a page I generate for that film, which contains more information (from both DBpedia and my own local database).
This is my first time using a huge data set and SPARQL, and I've noticed it's slow, which I guess can't be helped. I'd still very much like to use this data source, though.
I have my Rails app set up to use MongoDB, so I'm thinking I can use it as a cache of DBpedia data so that users don't need to wait on a query every single time. I'm stuck on the best way to implement this. My current thinking is along these lines:
On the first search ever, store the details of each result in my local database (probably basic film information such as title, overview, year, and alternate titles).
When a user searches, the following occurs:
1. Run a search query on my local database for relevant stored movies (searching title and overview only, most likely). If a film hasn't been updated from DBpedia in the past x days, don't include it.
2. Quickly display the relevant local results to the user and build a list of those movies.
3. While the user views the stored results, DBpedia gets queried.
4. From the query result, build a list of relevant results from DBpedia.
5. Remove from the DBpedia result set any movies that were in the initial local result set, to prevent the user from seeing duplicate results.
6. Display the remaining DBpedia results underneath the local results, and save each of the new, non-stored results in my local database (including a last_updated time, and updating existing local items as needed).
7. When the user clicks through to a film page, the basic information from DBpedia and the information I'm storing are held locally and can be pulled onto the page quickly, but the more advanced information (director, language, location, links to relevant sites) is queried from DBpedia at the time of loading. Show loading dialogs etc. on the different sections while the new information is retrieved.

I was thinking of doing the above so that the user can see a few results while the remaining results are loaded from DBpedia, and so that I'm not storing an insane amount of data.
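The staleness check and the de-duplication steps of the plan above can be sketched in plain Ruby. This is only a rough illustration: the `Film` struct stands in for whatever MongoDB document class the app uses, and `STALE_AFTER_DAYS`, `fresh_films` and `new_remote_films` are hypothetical names, not part of any library.

```ruby
require "set"

# Hypothetical in-memory stand-in for a cached film document.
Film = Struct.new(:uri, :title, :last_updated)

# Assumed value for the "x days" in the plan.
STALE_AFTER_DAYS = 7

# Keep only cached films that were refreshed within the staleness window.
def fresh_films(films, now: Time.now, max_age_days: STALE_AFTER_DAYS)
  cutoff = now - max_age_days * 24 * 60 * 60
  films.select { |film| film.last_updated >= cutoff }
end

# Drop DBpedia results whose URI already appears in the local result set,
# so the user never sees the same film twice.
def new_remote_films(local_films, remote_films)
  seen = local_films.map(&:uri).to_set
  remote_films.reject { |film| seen.include?(film.uri) }
end
```

Comparing on the DBpedia resource URI rather than on the title avoids treating remakes with the same name as duplicates. A stale cached film is left out of the local list, so it reappears in the remote list and gets re-saved with a new `last_updated`.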
But I wanted some help on whether this is realistic and whether it's a good idea. I can imagine that searching the local DB first might skew the user's initial results towards things that have been searched for before, and if a particular desired film (if I put in its exact title, for example) hasn't been searched for before, it might show up farther down the list. Would it make more sense to store a copy of the relevant data set (i.e. the movies) locally and update it as needed? That seems like a lot, right?
Anyway, I'd appreciate any suggestions on a way to make things as seamless as possible for the user while still staying within the boundaries of sanity. Thanks in advance!
Edit: here is the code for the test search query I'm using. I thought I was making it super basic for testing... but it times out a lot.
    query = "
      PREFIX owl: <http://www.w3.org/2002/07/owl#>
      PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX foaf: <http://xmlns.com/foaf/0.1/>
      PREFIX dc: <http://purl.org/dc/elements/1.1/>
      PREFIX : <http://dbpedia.org/resource/>
      PREFIX dbpedia2: <http://dbpedia.org/property/>
      PREFIX dbpedia: <http://dbpedia.org/>
      PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
      PREFIX dbo: <http://dbpedia.org/ontology/>

      SELECT ?subject ?label ?abstract ?runtime ?date ?name {
        {?subject rdf:type <http://dbpedia.org/ontology/Film>}
        UNION
        {?subject rdf:type <http://dbpedia.org/ontology/TelevisionShow>}.
        OPTIONAL {?subject dbo:runtime ?runtime}.
        OPTIONAL {?subject dbo:releaseDate ?date}.
        OPTIONAL {?subject foaf:name ?name}.
        ?subject rdfs:comment ?abstract.
        ?subject rdfs:label ?label.
        FILTER((lang(?abstract) = 'en') && (lang(?label) = 'en') && regex(?label, '" + str + "')).
      }
      LIMIT 30
    "
    result = {}
    client = SPARQL::Client.new("http://dbpedia.org/sparql")
    result = client.query(query).each_binding { |name, value| puts value.inspect }
    return result
Which SPARQL query are you using to query DBpedia? It should be possible to optimise it to improve performance. You should be able to filter using category URIs. You should also be able to use OFFSET and LIMIT to cut down the number of results returned. If you are doing full-text searches, consider using the Virtuoso-specific 'bif:contains' property, since it is a bit quicker than regex filters, although it has the downside of being non-standard. Additionally, you can use HTTP caching to improve subsequent search results (the SPARQL protocol operates over HTTP, unsurprisingly).
Other than that, instead of putting stuff into MongoDB, you might try running your own triplestore and loading the movies from DBpedia each night.
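As a rough sketch of the 'bif:contains' plus LIMIT/OFFSET suggestion, the query string could be built along these lines. `film_search_query` is a hypothetical helper, the word-only sanitising is an assumption about what input is acceptable, and the query relies on Virtuoso's built-in `bif:` prefix, so it is not portable to other SPARQL endpoints.

```ruby
# Build a DBpedia film search query that uses Virtuoso's bif:contains
# instead of a regex FILTER. Hypothetical helper; names are illustrative.
def film_search_query(term, limit: 30, offset: 0)
  # Keep only alphanumeric words so the term cannot break out of the literal.
  words = term.to_s.scan(/[[:alnum:]]+/)
  raise ArgumentError, "search term is empty" if words.empty?
  phrase = words.join(" ")

  <<~SPARQL
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dbo: <http://dbpedia.org/ontology/>

    SELECT ?subject ?label WHERE {
      ?subject a dbo:Film ;
               rdfs:label ?label .
      ?label bif:contains "'#{phrase}'" .
      FILTER(lang(?label) = 'en')
    }
    LIMIT #{Integer(limit)} OFFSET #{Integer(offset)}
  SPARQL
end
```

The resulting string can be handed to `SPARQL::Client#query` in the same way as the query in the question. Paging would then be a matter of increasing `offset` in steps of `limit`.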
Edited based on the provision of the query:
OK, by trial and error, the following patterns are causing the big problems:
    ?subject rdfs:comment ?abstract.
    ?subject rdfs:label ?label.
    FILTER((lang(?abstract) = 'en') && (lang(?label) = 'en') && regex(?label, '" + str + "')).
Filters can be slow, but even without the filters the query times out. I would have been more concerned about the OPTIONAL clauses (OPTIONAL can be slow), so try it without them. You might need to run a separate query for the abstracts and labels.
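One way to split the work, sketched under some assumptions: run a first query that returns only the matching subjects, then fetch the abstracts for just those subjects in a second query. `abstracts_query` is a hypothetical helper, and using a SPARQL 1.1 `VALUES` block to pass the subjects is a choice made for this illustration, not something the query above already does.

```ruby
# Build a follow-up query that fetches English abstracts for subjects
# already found by an earlier, cheaper search query.
def abstracts_query(subject_uris)
  uris = subject_uris.map(&:to_s)
  raise ArgumentError, "no subjects given" if uris.empty?

  # Only accept plain DBpedia resource URIs, so nothing can escape the <...>.
  bad = uris.find { |u| u !~ /\Ahttp:\/\/dbpedia\.org\/resource\/[^\s<>"]+\z/ }
  raise ArgumentError, "unexpected URI: #{bad}" if bad

  values = uris.map { |u| "<#{u}>" }.join(" ")
  <<~SPARQL
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT ?subject ?abstract WHERE {
      VALUES ?subject { #{values} }
      ?subject rdfs:comment ?abstract .
      FILTER(lang(?abstract) = 'en')
    }
  SPARQL
end
```

Because the second query is restricted to a short list of known subjects, the language filter only has to look at a few comments instead of every film in the data set.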
ruby-on-rails ruby-on-rails-3 caching mongodb sparql