Seeks On Web


Once you have set up a Seeks proxy and have it running locally or on a remote server, you can make it available to the public directly through a Web site if you wish.

This lets users submit their queries to Seeks from a web page, as they would with a traditional search engine.

There are two ways to do so. The first solution uses a very light HTTP server that is built in as a plugin in Seeks versions >= 0.2.3. The second solution requires an external webserver. The first solution (plugin) gives your node better performance. The second solution is heavier but has some advantages, in particular:

  • it allows the use of SSL, whereas the HTTP server plugin, which is based on libevent, has no support for encryption yet.
  • it allows the webserver to run on a different machine from Seeks itself, something that can't be done with the built-in plugin.

Requirements for option 1, built-in plugin for Seeks >= 0.2.3:

  • a running Seeks proxy.

Requirements for option 2, with an external webserver:

  • a running Seeks proxy,
  • a running webserver,
  • a script to route Web queries to the proxy.


Option 1, built-in plugin

Again, you need Seeks >= 0.2.3. The server requires libevent to run.

When compiling Seeks, enable the HTTP server plugin:

./configure --enable-httpserv-plugin=yes --with-libevent=/your/path/to/libevent

Then compile. Before running, you must add the following to your src/proxy/config file:

activated-plugin httpserv

Then run Seeks. At startup you should see a line indicating that the webserver is running.

By default the server runs on localhost:8080. You can change this behavior by editing

src/plugins/httpserv/httpserv-config

from the sources.
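
To confirm that the built-in server is actually accepting connections, a small probe can help. The following is only a sketch: the function name is_listening is hypothetical, and the defaults simply mirror the localhost:8080 default mentioned above.

```python
import socket


def is_listening(host="localhost", port=8080, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_listening() with no arguments checks the default address; pass your own host and port if you changed them in httpserv-config. A True result only shows that some process is listening there, not that it is Seeks.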

On public nodes, it is recommended to use a robots.txt file to keep crawlers from hitting your websearch node. The robots.txt file must be put in the websearch public directory. If you are running Seeks from the source repository, add your robots.txt to

src/public/

If you have installed Seeks in your home directory or system-wide, add your robots.txt file to

<your_install_repository>/share/seeks/public/
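
As an illustration, the sketch below writes a robots.txt into a public directory and checks it with Python's standard robots.txt parser. The blanket "Disallow: /" rule is an assumption (the page does not say which rules to use), and the helper names, the crawler name ExampleBot, and the example.org URL are hypothetical.

```python
import os
from urllib.robotparser import RobotFileParser

# Assumption: ask every crawler to stay away from the whole node.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"


def write_robots(public_dir):
    """Write robots.txt into public_dir and return the path of the file."""
    path = os.path.join(public_dir, "robots.txt")
    with open(path, "w") as handle:
        handle.write(ROBOTS_TXT)
    return path


def blocks_crawlers(robots_path, url="http://example.org/websearch-hp"):
    """Return True if the file forbids a generic crawler from fetching url."""
    parser = RobotFileParser()
    with open(robots_path) as handle:
        parser.parse(handle.read().splitlines())
    # ExampleBot is a made-up crawler name; it falls under the "*" rule.
    return not parser.can_fetch("ExampleBot", url)
```

Pass whichever of the two directories above applies to your installation as public_dir. Keep in mind that robots.txt is advisory: well-behaved crawlers honor it, others may not.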

Option 2, external webserver

You must set up the webserver yourself. The required scripts are given below, one for Django and one for PHP; pick the one you prefer. For beginners, we recommend the PHP script.

Django

settings.py

SEEKS_PROXY = 'http://localhost:8118'
SEEKS_URI = 'http://s.s/'
SEEKS_PATH = 'seeks/'

urls.py

from django.conf import settings

[...]

     (r'^%s(?P<path>.*)$' % settings.SEEKS_PATH, 'PROJECTNAME.seeks.views.seeks'),

seeks/views.py

Use this script for versions of Seeks lower than Bubs-0.2-beta

# Copyright Camille Harang
# 
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as
# published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version.
# 
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# 
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see http://www.fsf.org/licensing/licenses/agpl-3.0.html.

import httplib  # needed for the BadStatusLine handler below
import urllib2
from urlparse import urljoin
from django.conf import settings
from django.http import HttpResponse, HttpResponseRedirect, HttpResponseServerError

def seeks(request, path):

    if path == '': return HttpResponseRedirect('websearch-hp')

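    # settings.ROOT_URL is not part of the settings.py excerpt above: define it
    # there as the public base URL of your site.
    public_url = urljoin(settings.ROOT_URL, settings.SEEKS_PATH)
    local_url = urljoin(settings.SEEKS_URI, path)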
    if 'QUERY_STRING' in request.META and request.META['QUERY_STRING']:
        local_url = '%s?%s' % (local_url, request.META['QUERY_STRING'])

    opener = urllib2.build_opener(urllib2.ProxyHandler({'http': settings.SEEKS_PROXY}))
    headers = [('Seeks-Remote-Location', public_url)]
    if 'HTTP_ACCEPT_LANGUAGE' in request.META:
        headers.append(('Accept-Language', request.META['HTTP_ACCEPT_LANGUAGE']))
    opener.addheaders = headers

    try:

        o = opener.open(urllib2.Request(local_url))

        info = o.info()
        # Keep only the media type, dropping parameters such as the charset.
        mime = ''
        if 'content-type' in info: mime = info['content-type'].split(';')[0]

    except urllib2.HTTPError, err:     return HttpResponseServerError('ERROR %s' % err)
    except urllib2.URLError, err:      return HttpResponseServerError('ERROR %s' % err.reason)
    except httplib.BadStatusLine, err: return HttpResponseServerError('ERROR %s' % err)
    except Exception:                  return HttpResponseServerError('ERROR')
        
    return HttpResponse(o.read(), mimetype=mime)
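
The view above targets Python 2 and an old Django release. For readers on Python 3, the core of the relay (join the path onto the Seeks URI, add the Seeks-Remote-Location header, and send the request through the proxy) might be sketched as follows, independently of any web framework. The function names are hypothetical, and the constants repeat the values from settings.py.

```python
from urllib.parse import urljoin
from urllib.request import ProxyHandler, Request, build_opener

SEEKS_PROXY = "http://localhost:8118"  # same value as in settings.py
SEEKS_URI = "http://s.s/"              # same value as in settings.py


def build_seeks_request(path, query_string, public_url, accept_language=None):
    """Build the request that gets forwarded through the Seeks proxy."""
    local_url = urljoin(SEEKS_URI, path)
    if query_string:
        local_url = "%s?%s" % (local_url, query_string)
    request = Request(local_url)
    # Header the scripts on this page use to pass the public URL to Seeks.
    request.add_header("Seeks-Remote-Location", public_url)
    if accept_language:
        request.add_header("Accept-Language", accept_language)
    return request


def fetch_from_seeks(request, timeout=10):
    """Send the request through the proxy and return (content_type, body)."""
    opener = build_opener(ProxyHandler({"http": SEEKS_PROXY}))
    with opener.open(request, timeout=timeout) as response:
        # get_content_type() drops parameters such as the charset.
        return response.headers.get_content_type(), response.read()
```

Splitting request construction from the network call keeps the first half easy to check without a running proxy; fetch_from_seeks raises urllib.error.URLError if the proxy cannot be reached, which the caller should turn into an error response as the view above does.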

PHP

Dependencies

PHP with the cURL extension enabled (the script below relies on the curl_* functions).

Code

Use this script for versions of Seeks lower than Bubs-0.2-beta

<?php

/* Copyright Camille Harang

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program. If not, see http://www.fsf.org/licensing/licenses/agpl-3.0.html. */

if (array_key_exists('HTTPS', $_SERVER) && $_SERVER['HTTPS']) $scheme = 'https://';
else $scheme= 'http://';

$seeks_uri = 'http://s.s';
$proxy = 'localhost:8118';
$base_script = $_SERVER['SCRIPT_NAME'];
$base_url = $scheme.$_SERVER['HTTP_HOST'].$base_script;

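if ($_SERVER['REQUEST_URI'] == '/') {
    // Redirect the bare root to the Seeks home page and stop here,
    // otherwise the script would go on with $url undefined.
    header('Location: '.$base_url.'/websearch-hp');
    exit;
}
$url = $seeks_uri.str_replace($base_script, '', $_SERVER['REQUEST_URI']);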

$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_PROXY, $proxy);
curl_setopt ($curl, CURLOPT_RETURNTRANSFER, 1) ;
$header = array("Seeks-Remote-Location: ".$base_url);
if (array_key_exists('HTTP_ACCEPT_LANGUAGE', $_SERVER)) $header[] = "Accept-Language: ".$_SERVER['HTTP_ACCEPT_LANGUAGE'];
curl_setopt ($curl, CURLOPT_HTTPHEADER, $header);
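$result = curl_exec($curl);
if ($result === false) {
    // Report the failure and stop before any other output is sent.
    $error = curl_error($curl);
    curl_close($curl);
    header('HTTP/1.1 502 Bad Gateway');
    exit('CURL ERROR: '.$error);
}
$result_info = curl_getinfo($curl);
curl_close($curl);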

header('Content-Type: '.$result_info['content_type']);

echo $result;
?>

Depending on your configuration, you may have to adapt the script above.

For now, contact us if you need help setting up your Web interface. We will update this page based on users' experience.

Anonymization

Sending the Seeks query log to /dev/null

On *nix systems:

./seeks 2> /dev/null

Apache

 # Set the pattern so that it matches the location of Seeks on your server
 SetEnvIf Request_URI "^/seeks/" seeks
 # Requests to Seeks are discarded
 CustomLog /dev/null combined env=seeks
 # Everything else goes to your usual logging file
 CustomLog /var/www/access.log combined env=!seeks

Lighttpd

 $HTTP["host"] =~ "^your_node_address$" 
 {
 accesslog.filename = "/dev/null"
 server.errorlog = "/dev/null"
 }

Tips

How to prevent Seeks from crashing

  • Help debug it.
  • Run it in an endless loop so that it restarts whenever it exits (convenient inside a screen session):
while true; do ./seeks; done
  • Another way of doing the same thing, using cron and running Seeks as a daemon:

Add the following line to your system crontab file (the line includes a user field, root, as in /etc/crontab):

*/5 * * * * root [ ! -f /var/run/seeks.pid -o -z "$(cat /var/run/seeks.pid 2>/dev/null )" -o ! -d "/proc/$(cat /var/run/seeks.pid 2>/dev/null)" ]  && cd seekpath && ./seeks --daemon --pidfile /var/run/seeks.pid

where seekpath is the path to your copy of Seeks. Every 5 minutes, this checks whether Seeks has died and restarts it if so.

For the check to work, Seeks must write its PID file, so start it with the arguments:

./seeks --daemon --pidfile /var/run/seeks.pid
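
The test performed by the cron line (PID file missing, PID file empty, or no live process for the recorded PID) can also be written out as a short function, which may be easier to read and adapt. This is a sketch only: the name seeks_is_running is hypothetical, and it swaps the /proc lookup for os.kill with signal 0, which checks for the process without sending it anything.

```python
import os


def seeks_is_running(pidfile="/var/run/seeks.pid"):
    """Return True if the PID recorded in pidfile belongs to a live process."""
    try:
        with open(pidfile) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or empty PID file: treat Seeks as not running.
        return False
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Like the cron line, this only establishes that some process holds the recorded PID; it does not verify that the process is Seeks.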

On public nodes, it is recommended you use a robots.txt to block crawlers that may try to hit your websearch node and stress it for no purpose.

Built-in http server and lighttpd as a reverse-proxy

To put lighttpd in front of the built-in HTTP server as a reverse proxy and get faster results, you can use the following lighttpd configuration snippet:

$HTTP["host"] =~ "seeks.zat.im" { 
  proxy.server  = ( "" => (( "host" => "127.0.0.1", "port" => 8080 )) ) 
}

Replace "seeks.zat.im" with the host name of your own node, and 8080 with the port of the built-in HTTP server if you changed it.

Set up an access control list for an open proxy

It is recommended you control who can use your external proxy if it is open for connection by outsiders.

To do so, modify the options permit-access and deny-access in the proxy configuration file (in the sources in src/config).

See the detailed configuration in the proxy config file itself.
