Seeks Configuration
Seeks has several configurable elements:
- the proxy,
- the websearch plugin,
- the image websearch plugin.
Proxy configuration
The proxy is derived from Privoxy, so its configuration file looks similar to Privoxy's.
Websearch plugin configuration
The websearch plugin configuration file is
src/plugins/websearch/websearch-config
All the following configuration options and their default values are to be found in this file.
websearch language
The option search-language defines the websearch language of preference.
Default: en
search-language fr
Sets the language to French.
- automatic detection is based on HTTP headers.
- default: auto
number of results per page
Maximum number of websearch results on a single page. Default: 10
search-results-page 10
websearch engine selection
The option search-engine allows the selection of a set of search engines to get results from. Currently, Google, Bing, Yahoo, Cuil, and Exalead are supported and activated by default.
search-engine google
search-engine bing
search-engine cuil
search-engine yahoo
search-engine exalead
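To illustrate how a file of this shape can be read, here is a minimal Python sketch that parses `option value` lines and collects repeated options such as search-engine into a list. It is not the parser Seeks itself uses; the function name `parse_websearch_config`, the assumption that each option sits on its own line, and the treatment of `#` as a comment marker (borrowed from Privoxy-style files) are all assumptions made for this example.

```python
from collections import defaultdict


def parse_websearch_config(text):
    """Parse 'option value' lines into a dict mapping option -> list of values."""
    options = defaultdict(list)
    for raw_line in text.splitlines():
        # Assumption: '#' starts a comment, as in Privoxy-style files.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)  # option name, then the rest of the line
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        options[name].append(value)
    return dict(options)


sample = """\
search-language fr
search-results-page 10
search-engine google
search-engine bing
search-engine yahoo
"""

config = parse_websearch_config(sample)
print(config["search-engine"])
```

Keeping every value in a list means an option that appears several times, like search-engine, needs no special handling.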
websearch cache expiration (seconds)
Minimum number of seconds that search results are kept in the system cache (for reuse, updates, and so on) while not being used. The cache is per query, and the delay is reset every time a live query is accessed. Default: 300
query-context-delay 300
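The expiration rule described above can be sketched as a small per-query cache whose timer restarts on each access. This is only an illustration of the behavior, not Seeks's implementation; the class name `QueryCache`, its methods, and the injectable `clock` parameter are hypothetical.

```python
import time


class QueryCache:
    """Toy per-query cache whose expiry timer restarts on every access."""

    def __init__(self, delay_seconds=300, clock=time.monotonic):
        self.delay = delay_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._entries = {}  # query -> (results, time of last access)

    def put(self, query, results):
        self._entries[query] = (results, self.clock())

    def get(self, query):
        entry = self._entries.get(query)
        if entry is None:
            return None
        results, last_access = entry
        now = self.clock()
        if now - last_access > self.delay:
            # Idle for longer than the delay: the entry has expired.
            del self._entries[query]
            return None
        # Accessing a live query restarts its timer.
        self._entries[query] = (results, now)
        return results
```

Expired entries are only dropped when they are looked up again, which fits the description of the delay as a minimum rather than an exact lifetime.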
enabling thumbnails
The following option enables the insertion of thumbnails from http://www.thumbshots.com, for websearch result URLs. Default: 0
enable-thumbs 1
enables them.
enabling javascript
Enabling javascript on the websearch results pages enables keyboard shortcuts, and will allow all sorts of dynamic treatments in the future. Default: 0
enable-js 1
enables javascript.
enabling background content analysis
This option enables the background download of the content pointed to by websearch results. Enabling it makes Seeks slower and more bandwidth-demanding than the default behavior. However, the content-aware system offers more features, such as better aggregation of websearch snippets from multiple search engines, preemptive caching of the webpages pointed to by websearch results, and accurate automated similarity analysis and clustering of the results. Default: 0
enable-content-analysis 1
activates the analysis of content in real-time.
connection and transfer timeouts
The options below control the connection and transfer timeouts to the search engines, and to other pages (typically for content analysis).
Default: 3
se-connect-timeout 3
connection timeout to search engines, in seconds.
Default: 5
se-transfer-timeout 5
transfer timeout when connecting to a search engine, in seconds.
Default: 1
ct-connect-timeout 1
connection timeout when fetching content for analysis & caching, in seconds.
Default: 3
ct-transfer-timeout 3
transfer timeout when fetching content for analysis & caching, in seconds.
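As a rough sketch of how the four options pair up, the Python below maps the `se-` and `ct-` prefixes to a (connect, transfer) pair and applies it to a plain socket. The helper names `timeouts_for` and `open_connection` are hypothetical, and a per-operation socket timeout is only an approximation of a transfer timeout; how Seeks enforces these limits internally is not described here.

```python
import socket

# Defaults as listed above, in seconds.
DEFAULT_TIMEOUTS = {
    "se-connect-timeout": 3,
    "se-transfer-timeout": 5,
    "ct-connect-timeout": 1,
    "ct-transfer-timeout": 3,
}


def timeouts_for(target, options=None):
    """Return (connect, transfer) timeouts for 'se' (search engines) or 'ct' (content)."""
    if target not in ("se", "ct"):
        raise ValueError("target must be 'se' or 'ct'")
    merged = dict(DEFAULT_TIMEOUTS)
    merged.update(options or {})  # user-supplied values override the defaults
    return merged[f"{target}-connect-timeout"], merged[f"{target}-transfer-timeout"]


def open_connection(host, port, target, options=None):
    """Open a TCP connection using the timeouts configured for the given target."""
    connect_timeout, transfer_timeout = timeouts_for(target, options)
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    # Approximation: this limit applies to each later send/recv call,
    # not to the transfer as a whole.
    sock.settimeout(transfer_timeout)
    return sock
```

The shorter defaults for content fetching keep a slow third-party page from delaying the results more than a search engine would.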
highlighting the most discriminative words
This option is applicable to version 0.2.2-SOLO and above. It enables more discriminative highlighting of words in result snippets. The highlighted words are those that best distinguish a snippet from all the other snippets in the results.
Default: 1
extended-highlight 1
Enables discriminative highlighting.
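One common way to find the words that best distinguish a snippet from the others is a TF-IDF style score, sketched below in Python. This is a stand-in chosen for illustration; the scoring Seeks actually uses is not specified on this page, and the names `tokenize` and `discriminative_words` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def discriminative_words(snippets, top_k=2):
    """For each snippet, return up to top_k words that set it apart from the rest."""
    token_lists = [tokenize(s) for s in snippets]
    n = len(token_lists)
    # Number of snippets in which each word appears at least once.
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    picks = []
    for tokens in token_lists:
        term_freq = Counter(tokens)
        # A word found in every snippet scores 0; rarer words score higher.
        scores = {w: c * math.log(n / doc_freq[w]) for w, c in term_freq.items()}
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        picks.append([w for w in ranked[:top_k] if scores[w] > 0])
    return picks


snippets = [
    "seeks search proxy",
    "seeks image search",
    "seeks search clustering results",
]
print(discriminative_words(snippets))
```

Words shared by every snippet, typically the query terms themselves, receive a zero score and are never picked.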
Websearch patterns
Seeks supports regexp patterns to either group or eliminate some results.
In the source repository, pattern files are found in src/plugins/websearch/patterns
The following files exist:
audio file_doc forum pdf qi_patterns reject video
- Files audio, file_doc, forum, pdf & video are used by Seeks to group results automatically by type.
- File qi_patterns is used by Seeks to intercept queries in proxy mode.
- File reject is used by Seeks to eliminate some results from queries. This file is empty by default. Adding regexp rules to the reject file makes it possible to control the results per URL; for example, the author rejects any result from experts-exchange.com.
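The effect of the reject file can be sketched in Python as follows. The layout assumed here (one regexp per line) and the helper names `load_patterns` and `filter_results` are assumptions for this example, and Python's `re` module stands in for whatever regexp engine Seeks uses, so pattern syntax may differ in detail.

```python
import re


def load_patterns(text):
    """Compile one regexp per non-empty line (assumed file layout)."""
    return [re.compile(line.strip()) for line in text.splitlines() if line.strip()]


def filter_results(urls, reject_patterns):
    """Keep only the URLs that match none of the reject patterns."""
    return [u for u in urls if not any(p.search(u) for p in reject_patterns)]


# Hypothetical content of the reject file.
reject_file = r"""
experts-exchange\.com
"""

urls = [
    "http://www.experts-exchange.com/some-question.html",
    "http://www.seeks-project.info/",
]

print(filter_results(urls, load_patterns(reject_file)))
```

With an empty reject file, `load_patterns` returns an empty list and every result is kept, which matches the default behavior.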
Image Websearch Configuration
Image websearch engine selection
The option img-search-engine allows the selection of a set of search engines to get results from. Currently, Google, Bing, Yahoo, Flickr, and Wikimedia Commons are supported and activated by default.
img-search-engine google
img-search-engine bing
img-search-engine flickr
img-search-engine wcommons
img-search-engine yahoo
enabling image background content analysis
Enables the background download of image thumbnails and their analysis to detect identical and near-identical images. Expected to be both slower and more bandwidth-demanding than when not activated.
Default: 0
img-content-analysis 1
activates the analysis of images in real-time.
number of results per page
Maximum number of image results on a single page.
Default: 60
img-per-page 60
enabling safe search of images
Enables safe search of images (1 for on, 0 for off).
Default: 1
safe-search 1
