Search

With the search action, a collection is queried using a search term.

When processing a request, Pandosearch searches multiple fields of each document in the collection. The exact fields depend on the configuration of the collection.

Request

To retrieve search results from Pandosearch, make a GET request to the following URL:

https://public.pandosearch.com/:collection/search

All possible parameters need to be provided as query parameters.

In the rest of this page, the full URL and collection name are omitted for readability.

Parameter: q

Use the q query parameter when searching for a full query string. The q parameter is automatically escaped and is always interpreted as a string.

The following call will return all search results for "pandosearch":

search?q=pandosearch
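
As a minimal sketch of how such a URL could be assembled in client code, the Python snippet below builds the full search URL with the standard library. The function name build_search_url and the collection name "example-collection" are placeholders for illustration, not part of the Pandosearch API. The client encodes the search term itself so that spaces and special characters survive the trip in the query string.

```python
from urllib.parse import urlencode

BASE_URL = "https://public.pandosearch.com"  # host from the request URL above


def build_search_url(collection: str, q: str, **params) -> str:
    """Return the full search URL for a collection and a search term.

    Extra keyword arguments (for example size=5) are added as query parameters.
    """
    query = {"q": q, **params}
    return f"{BASE_URL}/{collection}/search?{urlencode(query)}"


# "example-collection" is a hypothetical collection name.
print(build_search_url("example-collection", "pando search"))
# prints https://public.pandosearch.com/example-collection/search?q=pando+search
```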

Parameter: size

Use the size query parameter to control the number of results returned for each request. The size parameter expects an integer and defaults to 10.

The following call will return 5 search results for "pandosearch":

search?q=pandosearch&size=5

Parameter: page

When using size, you may want to implement page navigation. This can be done using the page parameter.

The page parameter expects a positive integer value and defaults to 1.

The following call will return results 6-10 for "pandosearch":

search?q=pandosearch&size=5&page=2

As you can see, page and size parameters depend on each other to determine the result set to be retrieved. Also, the total number of hits found determines the available pages to navigate to. Our response data can help you with this. See Data: pagination for more details.
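
The relationship between page, size and the total number of hits can be sketched with a little arithmetic. The helper names page_count and result_range below are illustrative only. In practice the pagination object in the response already provides the number of pages, so this sketch only shows how the values relate to each other.

```python
import math


def page_count(total: int, size: int) -> int:
    """Number of pages needed to show `total` results, `size` per page."""
    return math.ceil(total / size)


def result_range(page: int, size: int, total: int) -> tuple:
    """Return the 1-based (first, last) result positions shown on a page.

    Returns (0, 0) when the page lies beyond the last result.
    """
    first = (page - 1) * size + 1
    if first > total:
        return (0, 0)
    return (first, min(page * size, total))


print(result_range(page=2, size=5, total=12))  # prints (6, 10)
print(page_count(total=12, size=5))            # prints 3
```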

Parameter: facets

Facets are used for categorizing and filtering content. Pandosearch only returns documents matching the facet constraints you provide.

The exact facets configuration is different for every Pandosearch implementation. This documentation provides generic usage information.

General usage

Basic usage of a facet in a GET query parameter is as follows:

facets[:name]=:value
  • :name – String identifier for the facet.
  • :value – A specific facet value to filter results on.

It is also possible to query for multiple values for the same facet:

facets[:name][]=:value1&facets[:name][]=:value2

Note the use of the empty [] brackets to instruct Pandosearch to interpret values as list data.

Pandosearch can return documents matching all values in such a list or documents matching one of the values in the list. This depends on your facets configuration.

Example: docType facet

One of the default facets is the docType facet. This returns which types of documents are present in the current result set.

Let's say our collection contains documents of type "page" and "pdf". The following call will filter the result set to only display "pdf" documents for "pandosearch":

search?q=pandosearch&facets[docType]=pdf

Now, let's assume that there is a third docType value "blog". It is then possible to only return documents that are of type "pdf" or "blog":

search?q=pandosearch&facets[docType][]=pdf&facets[docType][]=blog

As stated before, the exact document matching logic for facets is configurable and can be specifically adjusted to your needs.
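
The two facet notations above can be generated from a simple mapping. The sketch below is one possible approach in Python. The helper names facet_params and search_query are hypothetical, and the square brackets are left unescaped only to keep the output identical to the examples on this page (percent-encoding them is the more conservative choice if your HTTP client does so by default).

```python
from urllib.parse import urlencode


def facet_params(facets: dict) -> list:
    """Turn {"docType": "pdf"} or {"docType": ["pdf", "blog"]} into query pairs."""
    pairs = []
    for name, value in facets.items():
        if isinstance(value, (list, tuple)):
            # Multiple values: use the empty [] brackets so they are read as a list.
            pairs.extend((f"facets[{name}][]", v) for v in value)
        else:
            pairs.append((f"facets[{name}]", value))
    return pairs


def search_query(q: str, facets: dict) -> str:
    """Return the relative search call, as written in the examples on this page."""
    pairs = [("q", q)] + facet_params(facets)
    return "search?" + urlencode(pairs, safe="[]")


print(search_query("pandosearch", {"docType": "pdf"}))
# prints search?q=pandosearch&facets[docType]=pdf
print(search_query("pandosearch", {"docType": ["pdf", "blog"]}))
# prints search?q=pandosearch&facets[docType][]=pdf&facets[docType][]=blog
```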

Response data

Facet information is returned in the API response in the facets object. Note that this is true even when no facets have been specified to filter on. See Data: facets for more details.

Parameter: full

When returning results, each document is returned with a highlighted content snippet by default. See Highlighting for more information.

This snippet is usually what you want, since you can display the part of the document where the search term was found, giving the end user context about the place where the search term occurred in a document.

Sometimes you may need to return the full text of a field instead.

This can be done using the full parameter:

search?q=pandosearch&full

This will retrieve all documents with their full body, instead of only the highlighted snippet.

Parameter: track

In Pandosearch, all API requests are automatically tracked. We do not track any personal information – only information about the query sent and the results returned.

Tracking can be turned off by using the track parameter. If track=false is passed, the result won't be tracked or included in usage reports. In all other cases, the result will be tracked and included in usage reports.

An example use case is internal testing. You usually don't want your test requests to pollute your search usage analytics. To achieve this, you can use the track parameter as follows:

search?q=test&track=false

Parameter: sort

Pandosearch ranks the results of a request based on relevancy. This means the most relevant content for the given q search term is the top result, as one expects from a search as a service.

For specific customer needs, it is possible to use custom sorting. In that case, we provide you with a custom sorter key. This sorter key can be passed to the API using the sort parameter.

Example: let's assume a sorter is configured, which sorts results alphabetically descending on title. This sorter is available via the key titledesc. The following call will return results for "pandosearch", with this specific sorting applied:

search?q=pandosearch&sort=titledesc

Be aware that sorting diminishes the user value of search, as the ranking is completely bypassed when you use sorting.

Response

A search response is a JSON document with the following basic structure:

{
  "total": 12,
  "hits": [
    {
      "type": "page",
      "url": "https://enrise.com/diensten/search/",
      "fields": {
        "title": "<b>Pandosearch</b>. Precies wat je zoekt.",
        "body": "Meer weten over search? Bel: 088-5553311 “Onze klanten krijgen razendsnel toegang tot producten en oplossingen. <b>Pandosearch</b> biedt exact dát wat we nodig hebben.” Waarom site search? Met site search vinden je gebruikers…"
      }
    },
    {},
    {}
  ],
  "facets": {
    "docType": [
      {
        "display": "blog",
        "key": "blog",
        "count": 7
      },
      {},
      {}
    ]
  },
  "request": {
    "q": "pandosearch",
    "page": 1,
    "size": 12,
    "facets": {},
    "full": false,
    "nocorrect": false,
    "track": true,
    "notiming": false
  },
  "received": {
    "q": "pandosearch"
  },
  "pagination": {
    "current": 1,
    "numPages": 1,
    "numResults": 12,
    "prelink": "?q=pandosearch&size=12&full=false&sort=undefined&nocorrect=false&track=true&notiming=false",
    "resultsPerPage": 12
  },
  "timing": {
    "search": 13.06,
    "search:took": 12,
    "request": 16
  }
}

In addition to the above, suggestions for alternative search terms are returned in case no results were found for the current search term.

Data: total

The total property is an integer containing the number of documents found over the full index. Note that this is not always equal to the length of the hits property of the response, since the length of hits is controlled by the size query parameter.

Data: hits

The hits property contains the documents returned for the current request, limited by the size and page parameters. It is always represented as an array of objects. Each object looks as follows:

{
  "type": "page",
  "url": "https://enrise.com/diensten/search/",
  "fields": {
    "title": "<b>Pandosearch</b>. Precies wat je zoekt.",
    "body": "Meer weten over search? Bel: 088-5553311 “Onze klanten krijgen razendsnel toegang tot producten en oplossingen. <b>Pandosearch</b> biedt exact dát wat we nodig hebben.” Waarom site search? Met site search vinden je gebruikers…"
  }
}
  • type – The type of document.
  • url – Unique identifier for this document.
  • fields – For each document, a range of fields can be returned. The actual fields are configured for every implementation. Each field value highlights the search term in <b></b> tags by default.
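
As a sketch of how the hits array might be processed, the Python snippet below reduces each hit to the values a result list typically needs. The function name summarize_hits is an assumption for illustration, and so is the presence of a title field, since the actual fields depend on your implementation.

```python
def summarize_hits(response: dict) -> list:
    """Return one small dict per hit with its type, url and title."""
    rows = []
    for hit in response.get("hits", []):
        fields = hit.get("fields", {})
        rows.append({
            "type": hit.get("type"),
            "url": hit.get("url"),
            # "title" is assumed to be configured; fall back to an empty string.
            "title": fields.get("title", ""),
        })
    return rows


# Shortened version of the example response on this page.
example = {
    "total": 12,
    "hits": [
        {
            "type": "page",
            "url": "https://enrise.com/diensten/search/",
            "fields": {"title": "<b>Pandosearch</b>. Precies wat je zoekt."},
        }
    ],
}
for row in summarize_hits(example):
    print(row["type"], row["url"], row["title"])
```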

Highlighting

When returning results, each document is returned highlighted. This works as follows:

The part of a field where the search term was found is transformed into a snippet. By default, a snippet is at most 250 characters long and it is "smart": words are kept intact as much as possible. The snippet is always created around the search term, so the search term is always included in the snippet and is wrapped in <b></b> tags (or other markup as configured for your implementation). Per result, only one snippet is returned.
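
If you need the snippet without markup, or only the highlighted terms, something along the following lines could work. This is a sketch that assumes the default <b></b> highlight tags. The function names are hypothetical, and the pattern needs adjusting when other markup is configured for your implementation.

```python
import re

# Assumes the default <b></b> highlight markup.
HIGHLIGHT_TAG = re.compile(r"</?b>")
HIGHLIGHTED_TERM = re.compile(r"<b>(.*?)</b>")


def plain_text(snippet: str) -> str:
    """Remove the highlight tags from a snippet, keeping the text itself."""
    return HIGHLIGHT_TAG.sub("", snippet)


def highlighted_terms(snippet: str) -> list:
    """Return the pieces of text that were wrapped in highlight tags."""
    return HIGHLIGHTED_TERM.findall(snippet)


title = "<b>Pandosearch</b>. Precies wat je zoekt."
print(plain_text(title))         # prints Pandosearch. Precies wat je zoekt.
print(highlighted_terms(title))  # prints ['Pandosearch']
```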

Data: facets

A facet groups the hits in a result set based on configured properties. The facets object in the response contains all configured facets, calculated over the current result set. The display values for each facet depend on the configuration, but the object generally has the following structure:

{
  ":name": [
    {
      "key": "docType",
      "display": "Document type",
      "count": 0
    }
  ],
  ":name": []
}

Where :name is a unique string identifier for the facet. Each facet contains an array of so-called "buckets":

  • key – A unique bucket key which can be used for filtering.
  • display – A human-friendly bucket name meant for displaying purposes.
  • count – The number of documents in this bucket.

The number of buckets returned per facet is configurable.
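
The buckets map naturally onto filter options in a user interface. The sketch below shows one way to derive a label and the matching filter parameter from each bucket. The function name facet_options and the label format are assumptions made for illustration.

```python
def facet_options(facets: dict, name: str) -> list:
    """Return a label and a filter parameter for every bucket of one facet."""
    options = []
    for bucket in facets.get(name, []):
        if not bucket:
            # Skip empty placeholder objects such as {} in abbreviated examples.
            continue
        options.append({
            "label": f'{bucket["display"]} ({bucket["count"]})',
            "param": f'facets[{name}]={bucket["key"]}',
        })
    return options


facets = {"docType": [{"display": "blog", "key": "blog", "count": 7}]}
print(facet_options(facets, "docType"))
# prints [{'label': 'blog (7)', 'param': 'facets[docType]=blog'}]
```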

Data: suggestions

When searching for a term that does not return results, a suggestions object is included as part of the response.

Let's say someone actually tries to search for Pandosearch, but misspells it, like "pnadosaerch" (two typos). The following suggestions data will then be part of the response:

{
  "total": 0,
  "hits": [],
  "facets": {
    "docType": []
  },
  "suggestions": {
    "didyoumean": {
      "text": "pandosearch",
      "highlighted": "<i>pandosearch</i>"
    }
  }
}

Did you mean

Within the suggestions object, a didyoumean suggestion is included with the following properties:

  • text – The suggested search term in plain text.
  • highlighted – The suggested search term highlighted.

The suggestion given is based on the word in the index that looks most like the query. A query can be at most two errors away from an original word. If a word contains more typos, it will not be matched.

Given our example, this means that:

  • "pnadosearch" will return "pandosearch" as didyoumean suggestion;
  • "pnadosaerch" will too (two typos, both one position off);
  • "pnadosreach" will not (the "r" is more than one position off); and
  • "pnaodsaerhc" will not (too many typos).

In addition to the above, it is possible to give search results for the didyoumean suggestion right away. This can be configured by us for your implementation.

Let's repeat our previous example with didyoumean results enabled. The following properties are now in the response:

{
  "total": 0,
  "hits": [],
  "facets": {
    "docType": []
  },
  "suggestions": {
    "didyoumean": [
      {
        "text": "pandosearch",
        "highlighted": "<i>pandosearch</i>",
        "assumed": true,
        "result": {
          "total": 12,
          "hits": [
            {
              "type": "page",
              "url": "https://enrise.com/diensten/search/",
              "fields": {
                "title": "<b>Pandosearch</b>. Precies wat je zoekt.",
                "body": "Meer weten over search? Bel: 088-5553311 “Onze klanten krijgen razendsnel toegang tot producten en oplossingen. <b>Pandosearch</b> biedt exact dát wat we nodig hebben.” Waarom site search? Met site search vinden je gebruikers…"
              }
            },
            {}
          ]
        }
      }
    ]
  }
}

You see that assumed: true is added to the didyoumean suggestion and that a result object is added. The result object contains all properties of a normal search response, as if the didyoumean suggestion had been the search term ("pandosearch" in the example above).

Returning the exact same data structure is intentional, as it allows for normal search response processing logic to be reused. For your search end users, this enables you to provide instant search responses without a need to manually correct typos first.
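
Because the nested result has the same shape as a normal response, client code can select which of the two to render and then continue with its usual processing. The following is a minimal sketch of that idea. The function name effective_result is hypothetical, and since the examples on this page show didyoumean both as a single object and as a list, the sketch accepts either form.

```python
from typing import Optional, Tuple


def effective_result(response: dict) -> Tuple[dict, Optional[str]]:
    """Return (result, corrected_term) for rendering.

    corrected_term is None when the results of the original query are used.
    """
    if response.get("total", 0) > 0:
        return response, None

    suggestion = response.get("suggestions", {}).get("didyoumean")
    # Accept both the object form and the list form shown in the examples.
    if isinstance(suggestion, list):
        suggestion = suggestion[0] if suggestion else None

    if suggestion and suggestion.get("assumed") and "result" in suggestion:
        return suggestion["result"], suggestion["text"]
    return response, None


response = {
    "total": 0,
    "hits": [],
    "suggestions": {
        "didyoumean": [
            {"text": "pandosearch", "assumed": True, "result": {"total": 12, "hits": []}}
        ]
    },
}
result, corrected = effective_result(response)
print(result["total"], corrected)  # prints 12 pandosearch
```

The corrected term can then be shown to the end user, for example in a "showing results for" message.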

Data: request

This is an object containing the full request object with which a request was performed in our API. It contains all query string parameters you provided and all default settings for other parameters.

Data: received

This is an object containing only the sanitized query string parameters you provided.

Data: pagination

By default, a pagination object is returned, which contains all useful information for building up pagination for your search request. Example:

{
  "pagination": {
    "current": 1,
    "numPages": 1,
    "numResults": 12,
    "prelink": "?q=pandosearch&size=12&full=false&sort=undefined&nocorrect=false&track=true&notiming=false",
    "resultsPerPage": 12
  }
}

The properties in this object are:

  • current – Integer containing the current page
  • numPages – Integer containing the total number of pages
  • numResults – Integer containing the total number of results
  • prelink – Query string to which the page query string parameter can be appended
  • resultsPerPage – Integer containing the number of results per page
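
As an illustration of how these properties could drive page navigation, the sketch below builds one link per page from prelink and numPages. The function name page_links is hypothetical. The sketch assumes that prelink already contains at least one parameter, as in the example above, so the page parameter is appended with "&".

```python
def page_links(pagination: dict) -> list:
    """Return a list of (page number, query string, is_current) tuples."""
    prelink = pagination["prelink"]
    links = []
    for number in range(1, pagination["numPages"] + 1):
        # Assumes prelink already holds parameters, so "&" is the right separator.
        links.append((number, f"{prelink}&page={number}", number == pagination["current"]))
    return links


pagination = {
    "current": 2,
    "numPages": 3,
    "numResults": 12,
    "prelink": "?q=pandosearch&size=5",
    "resultsPerPage": 5,
}
for number, link, is_current in page_links(pagination):
    print(number, link, "(current)" if is_current else "")
```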

Data: timing

This is an object containing all information about the time your request took. It is divided into:

  • search:took – Total amount of time spent by Elasticsearch in ms.
  • search – Total amount of time between sending the request to Elasticsearch and receiving the response back from Elasticsearch in ms (this is basically search:took + network overhead).
  • request – Total amount of time between the request coming in and sending the response out in ms.
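
Since the three values are nested in each other, subtracting them gives a rough breakdown of where the time went. The sketch below shows that arithmetic. The function name timing_breakdown and the labels "network" and "api" are our own reading of the descriptions above, not terms used by the API.

```python
def timing_breakdown(timing: dict) -> dict:
    """Split the total request time (in ms) into three rough parts."""
    return {
        # Time spent inside Elasticsearch itself.
        "elasticsearch": timing["search:took"],
        # Round trip to Elasticsearch minus the time it spent searching.
        "network": round(timing["search"] - timing["search:took"], 2),
        # Remaining time between receiving the request and sending the response.
        "api": round(timing["request"] - timing["search"], 2),
    }


# Values from the example response on this page.
print(timing_breakdown({"search": 13.06, "search:took": 12, "request": 16}))
```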