Seeks Configuration

Seeks has several configurable elements:

  • the proxy,
  • the websearch plugin,
  • the image websearch plugin.


Proxy configuration

The proxy is derived from Privoxy, so its configuration file looks similar to Privoxy's.

Websearch plugin configuration

The websearch plugin configuration file is

src/plugins/websearch/websearch-config

All the following configuration options and their default values are to be found in this file.

websearch language

The option search-language defines the websearch language of preference. Default: en

search-language fr

Sets the language to French.

Automatic detection is based on HTTP headers. Default: auto
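
To illustrate what detection from HTTP headers can involve, here is a minimal Python sketch. It assumes the relevant header is Accept-Language, which the text above does not state, and the function name detect_language is hypothetical. It is not taken from the Seeks source.

```python
def detect_language(accept_language, default="en"):
    """Return the primary language with the highest q-value in the header.

    Assumption: the language is read from an Accept-Language style header;
    this is an illustration, not the Seeks implementation.
    """
    best_lang, best_q = default, 0.0
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        q = 1.0  # a language without an explicit q-value has weight 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                continue  # skip malformed entries
        lang = tag.strip().split("-")[0].lower()  # "fr-FR" -> "fr"
        if lang and lang != "*" and q > best_q:
            best_lang, best_q = lang, q
    return best_lang


language = detect_language("fr-FR,fr;q=0.9,en;q=0.8")
```

With the sample header, the sketch picks fr, which corresponds to what search-language fr would set explicitly.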

number of results per page

Maximum number of websearch results on a single page. Default: 10

search-results-page 10

websearch engine selection

The option search-engine allows the selection of a set of search engines to get results from. Currently, Google, Bing, Yahoo, Cuil, and Exalead are supported and activated by default.

search-engine google
search-engine bing
search-engine cuil
search-engine yahoo
search-engine exalead
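
Because search-engine may appear several times, a reader of this file has to collect repeated options into a list. The Python sketch below shows one way to do that. It assumes each line is an option name followed by whitespace and a value, and that # starts a comment; the function parse_config is hypothetical and is not part of Seeks.

```python
from collections import defaultdict


def parse_config(text):
    """Parse 'option value' lines into a dict mapping option -> list of values."""
    options = defaultdict(list)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not line:
            continue
        parts = line.split(None, 1)  # option name, then the rest of the line
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        options[name].append(value)
    return dict(options)


SAMPLE = """\
search-language fr
search-results-page 10
search-engine google
search-engine bing
# search-engine cuil
search-engine yahoo
"""

config = parse_config(SAMPLE)
engines = config["search-engine"]  # one entry per search-engine line
```

In this sample the cuil line is commented out, so only google, bing and yahoo end up in the list.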

websearch cache expiration (seconds)

Minimum number of seconds that search results are kept in the system cache (for reuse, updates, and so on) while not being used. The cache is per query, and the delay is reset every time a live query is accessed. Default: 300

query-context-delay 300
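
The expiration rule can be pictured with the following Python sketch. It is an illustration under assumptions, not the Seeks implementation: the class QueryCache and its methods are hypothetical, and expired entries are dropped lazily on lookup for simplicity.

```python
import time


class QueryCache:
    """Per-query cache whose expiry timer restarts on every access."""

    def __init__(self, delay=300, clock=time.monotonic):
        self.delay = delay    # seconds, plays the role of query-context-delay
        self.clock = clock    # injectable clock, useful for testing
        self._entries = {}    # query -> (results, time of last access)

    def put(self, query, results):
        self._entries[query] = (results, self.clock())

    def get(self, query):
        entry = self._entries.get(query)
        if entry is None:
            return None
        results, last_access = entry
        now = self.clock()
        if now - last_access > self.delay:
            del self._entries[query]  # unused for longer than the delay
            return None
        self._entries[query] = (results, now)  # access restarts the timer
        return results


# Usage with a fake clock so the example does not need to wait.
now = [0.0]
cache = QueryCache(delay=300, clock=lambda: now[0])
cache.put("seeks", ["result"])
now[0] = 200.0
hit = cache.get("seeks")        # within 300 s, timer restarts at t=200
now[0] = 450.0
still_hit = cache.get("seeks")  # 250 s since last access, still cached
now[0] = 800.0
miss = cache.get("seeks")       # 350 s since last access, expired
```

The second lookup succeeds even though 450 seconds have passed since the entry was stored, because the first lookup reset the delay.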

enabling thumbnails

The following option enables the insertion of thumbnails from http://www.thumbshots.com, for websearch result URLs. Default: 0

enable-thumbs 1

Enables thumbnails.

enabling javascript

Enabling JavaScript on the websearch results pages enables keyboard shortcuts, and will allow all sorts of dynamic processing in the future. Default: 0

enable-js 1

Enables JavaScript.

enabling background content analysis

This option enables the background download of the content pointed to by websearch results. With this option on, Seeks is slower and uses more bandwidth than with the default behavior. In return, the content-aware system offers more features, such as better aggregation of websearch snippets from multiple search engines, preemptive caching of the webpages pointed to by websearch results, and accurate automated similarity analysis and clustering of the results. Default: 0

enable-content-analysis 1

Activates the analysis of content in real time.

connection and transfer timeouts

The options below control the connection and transfer timeouts to the search engines, and to other pages (typically for content analysis).

Default: 3

se-connect-timeout 3

connection timeout to search engines, in seconds.

Default: 5

se-transfer-timeout 5

transfer timeout when connecting to a search engine, in seconds.

Default: 1

ct-connect-timeout 1

connection timeout when fetching content for analysis & caching, in seconds.

Default: 3

ct-transfer-timeout 3

transfer timeout when fetching content for analysis & caching, in seconds.

highlighting the most discriminative words

This option applies to version 0.2.2-SOLO and above. It enables a more discriminative highlighting of words in result snippets. The highlighted words are those that best distinguish a snippet from all the other snippets in the results.

Default: 1

extended-highlight 1

Enables discriminative highlighting.
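
To give an intuition of what "discriminative" means here, the Python sketch below scores each word of a snippet by how often it occurs in that snippet and how rarely it occurs in the others (a TF-IDF style weight). This is a swapped-in, generic technique chosen for illustration; the actual scoring used by Seeks is not described on this page, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def discriminative_words(snippets, top_k=2):
    """For each snippet, return up to top_k words with the highest TF-IDF score."""
    docs = [Counter(tokenize(s)) for s in snippets]
    n = len(docs)
    df = Counter()  # number of snippets each word appears in
    for doc in docs:
        df.update(doc.keys())
    picks = []
    for doc in docs:
        # A word present in every snippet gets log(n / n) = 0 and is never picked.
        scores = {w: tf * math.log(n / df[w]) for w, tf in doc.items()}
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        picks.append([w for w in ranked[:top_k] if scores[w] > 0])
    return picks


snippets = [
    "seeks is a websearch proxy",
    "seeks is a metasearch engine",
    "seeks is a free software project",
]
picks = discriminative_words(snippets)
```

Words shared by every snippet, such as "seeks", score zero and are left unhighlighted, while words unique to one snippet rise to the top.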

Websearch patterns

Seeks supports regexp patterns to either group or eliminate some results.

In the source repository, pattern files are found in
src/plugins/websearch/patterns

The following files exist

audio  file_doc  forum  pdf  qi_patterns  reject  video
  • Files audio, file_doc, forum, pdf & video are used by Seeks to group results automatically by type.
  • File qi_patterns is used by Seeks to intercept queries in proxy mode.
  • File reject is used by Seeks to eliminate some results from queries. This file is empty by default. Adding regexp rules to the reject file lets you filter results by URL, e.g. the author rejects any result from experts-exchange.com.
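
The effect of a reject rule can be sketched in Python as follows. This is only an illustration: it assumes one regexp per line in the reject file and uses Python's re syntax, which may differ from the regexp flavor Seeks uses. The function keep and the sample URLs are hypothetical.

```python
import re

# Assumption: each line of the reject file holds one regexp.
reject_lines = [r"experts-exchange\.com"]
reject_patterns = [re.compile(p) for p in reject_lines]


def keep(url):
    """Return True if no reject pattern matches the URL."""
    return not any(p.search(url) for p in reject_patterns)


# Hypothetical result URLs, for illustration only.
urls = [
    "http://www.seeks-project.info/",
    "http://www.experts-exchange.com/some/question.html",
    "http://example.org/doc",
]
kept = [u for u in urls if keep(u)]
```

Here the experts-exchange.com result is dropped and the other two results are kept.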

Image Websearch Configuration

Image websearch engine selection

The option img-search-engine allows the selection of a set of search engines to get results from. Currently, Google, Bing, Yahoo, Flickr, and Wikimedia Commons are supported and activated by default.

img-search-engine google
img-search-engine bing
img-search-engine flickr
img-search-engine wcommons
img-search-engine yahoo

enabling image background content analysis

Enables background download of image thumbnails and their analysis for detecting identical and near-identical images. Expected to be both slower and more bandwidth-demanding than when not activated.

Default: 0

img-content-analysis 1

Activates the analysis of images in real time.

number of results per page

Maximum number of image results on a single page.

Default: 60

img-per-page 60

enabling safe search of images

Enables safe search of images (1 for on, 0 for off).

Default: 1

safe-search 1