docs(indexing): Removing mentions of indexing_mode in the doc
parent 531c90777b
commit bfc23df571
@ -35,7 +35,6 @@ const sidebarMenu = [
|
|||||||
{ title: 'Options', url: 'options.html' },
|
{ title: 'Options', url: 'options.html' },
|
||||||
{ title: 'Commandline', url: 'commandline.html' },
|
{ title: 'Commandline', url: 'commandline.html' },
|
||||||
{ title: 'Hooks', url: 'hooks.html' },
|
{ title: 'Hooks', url: 'hooks.html' },
|
||||||
{ title: 'Indexing modes', url: 'indexing-modes.html' },
|
|
||||||
],
|
],
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -48,11 +47,7 @@ const sidebarMenu = [
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
title: 'Tutorials',
|
title: 'Tutorials',
|
||||||
items: [
|
items: [{ title: 'Blog', url: 'blog.html' }],
|
||||||
{ title: 'Blog', url: 'blog.html' },
|
|
||||||
// { title: 'Dropdown menu', url: 'autocomplete.html' },
|
|
||||||
// { title: 'Collection search', url: 'collections.html' },
|
|
||||||
],
|
|
||||||
},
|
},
|
||||||
];
|
];
|
||||||
|
|
||||||
|
|||||||
@ -75,7 +75,7 @@ want to keep this key secret and not commit it to your versioning system.
|
|||||||
|
|
||||||
![jekyll algolia command example][6]
|
![jekyll algolia command example][6]
|
||||||
|
|
||||||
_Note that in the animation I simplified the method call to `jekyll algolia` by using an
|
_Note that in the animation we simplified the method call to `jekyll algolia` by using an
|
||||||
[alternative way][7] of loading the API key and using [rubygems-bundler][8] to
|
[alternative way][7] of loading the API key and using [rubygems-bundler][8] to
|
||||||
remove the need to add `bundle exec`._
|
remove the need to add `bundle exec`._
|
||||||
|
|
||||||
|
|||||||
@ -5,6 +5,12 @@ layout: content-with-menu.pug
|
|||||||
|
|
||||||
# How does this work?
|
# How does this work?
|
||||||
|
|
||||||
|
This page will give you a bit more insight about how the internals of the plugin
|
||||||
|
are working. This should give you more context to better understand the various
|
||||||
|
options you can configure.
|
||||||
|
|
||||||
|
## Extracting data
|
||||||
|
|
||||||
The plugin will work like a `jekyll build` run, but instead of writing `.html`
|
The plugin will work like a `jekyll build` run, but instead of writing `.html`
|
||||||
files to disk, it will push content to Algolia. It will go through each file
|
files to disk, it will push content to Algolia. It will go through each file
|
||||||
Jekyll would have processed in a regular build: pages, posts and collections.
|
Jekyll would have processed in a regular build: pages, posts and collections.
|
||||||
@ -53,16 +59,26 @@ front-matter). Specific data is the paragraph content, and information
|
|||||||
about its position in the page (where it's situated in the hierarchy of headings
|
about its position in the page (where it's situated in the hierarchy of headings
|
||||||
in the page).
|
in the page).
|
||||||
|
|
||||||
Once displayed, results are grouped so only the best matching paragraph of each
|
Using the [distinct setting][1] of the Algolia API, only the best matching
|
||||||
page is returned for a specific query. This greatly improves the perceived
|
paragraph of each page is returned for a specific query. This greatly improves
|
||||||
relevance of the search results.
|
the perceived relevance of the search results as you can highlight specifically
|
||||||
|
the part that was matching.
|
||||||
|
|
||||||
Because the plugin is splitting each page into smaller chunks, it can be hard to get
|
## Pushing data
|
||||||
an estimate of how many records will actually be pushed. The plugin tries to be
|
|
||||||
smart and consume as less operations as possible, but you can always run it in
|
|
||||||
`--dry-run` mode to better understand what it would do.
|
|
||||||
|
|
||||||
![jekyll algolia dry run example][1]
|
The plugin tries to be smart by using as few operations as possible, to be
|
||||||
|
mindful of your Algolia quota. Whenever you run `jekyll algolia`, only records
|
||||||
|
that changed since your last push will be updated.
|
||||||
|
|
||||||
[1]: ./assets/images/dry-run.gif
|
This is made possible because each record is attributed a unique `objectID`,
|
||||||
|
computed as a hash of the actual content of the record. Whenever the content of
|
||||||
|
the record changes, its `objectID` will change as well. This allows us to compare
|
||||||
|
what is currently available in your index and what is about to be pushed, to only
|
||||||
|
update what actually changed.
|
||||||
|
|
||||||
|
Previous outdated records will be deleted, and new updated records will be added
|
||||||
|
instead. All those operations are grouped into a batch call, making sure that
|
||||||
|
the changes are done atomically: your index will never be in an inconsistent
|
||||||
|
state where records are only partially updated.
|
||||||
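As a rough illustration of the content-hash comparison described above, here is a minimal Ruby sketch. It is not the plugin's actual code: the helper names `object_id_for` and `plan_update`, the choice of MD5 over a JSON dump, and the shape of the records are all assumptions made for this example.

```ruby
require 'digest'
require 'json'

# Hypothetical helper: derive an objectID from the content of a record.
# Keys are sorted so that two records with the same content always
# produce the same hash, whatever the key order.
def object_id_for(record)
  Digest::MD5.hexdigest(JSON.generate(record.sort.to_h))
end

# Hypothetical helper: compare the objectIDs already in the index with
# the records generated locally, and return what to delete and what to add.
def plan_update(remote_ids, local_records)
  local_by_id = local_records.map { |r| [object_id_for(r), r] }.to_h
  {
    # IDs in the index that no longer match any local record
    delete: remote_ids - local_by_id.keys,
    # Local records whose ID is not yet in the index
    add: local_by_id.reject { |id, _| remote_ids.include?(id) }.values
  }
end
```

With this approach an edited record shows up as one deletion (the old hash) plus one addition (the new hash), while unchanged records appear in neither list and cost no operation.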
|
|
||||||
|
[1]: https://www.algolia.com/doc/guides/ranking/distinct/?language=ruby#distinct-to-index-large-records
|
||||||
|
|||||||
@ -1,70 +0,0 @@
|
|||||||
---
|
|
||||||
title: Indexing modes
|
|
||||||
layout: content-with-menu.pug
|
|
||||||
---
|
|
||||||
|
|
||||||
# Indexing modes
|
|
||||||
|
|
||||||
Algolia's pricing model is based on the number of records you have in your index
|
|
||||||
as well as the number of add/edit/delete operations you operate on your index
|
|
||||||
per month.
|
|
||||||
|
|
||||||
By default, the plugin tries to be mindful of your quota and act in a smart way
|
|
||||||
by default: only updating records that changed between two runs.
|
|
||||||
|
|
||||||
It does so by attributing a unique `objectID` to each record, generated from the
|
|
||||||
actual content of this record. If the content changes, then the `objectID` will
|
|
||||||
change as well.
|
|
||||||
|
|
||||||
Because of this mechanism, the plugin can know which records changed between two
|
|
||||||
runs and will delete the records that are no longer needed and push the new ones
|
|
||||||
instead. Doing so only consumes a small number of operations (instead of pushing
|
|
||||||
everything each time).
|
|
||||||
|
|
||||||
When using the default `indexing_mode` value (`diff`), all those changes are
|
|
||||||
batched into one call to the API. They will be executed atomically (the index
|
|
||||||
will be updated with all the changes in one go, instead of one record at
|
|
||||||
a time). This allow users of the website to always search into the most
|
|
||||||
up-to-date version of the data.
|
|
||||||
|
|
||||||
This should work for 99% of the use-cases and you shouldn't need to change the
|
|
||||||
value of the `indexing_mode`.
|
|
||||||
|
|
||||||
|
|
||||||
## `diff` (default)
|
|
||||||
|
|
||||||
Using the default `diff` mode, the plugin will try to be smart when pushing
|
|
||||||
content to your index: it will only add/edit/delete what changed. All
|
|
||||||
records that didn't change will stay untouched.
|
|
||||||
|
|
||||||
To do so, it first grabs the list of all records in your index, then compares
|
|
||||||
them with the records generated locally. It then deletes the old records that no
|
|
||||||
longer exists, and add the newly created ones.
|
|
||||||
|
|
||||||
There is no notion of "updating" a record here because as soon as the content of
|
|
||||||
a record changes, it will be considered as a new record (thus, the old version
|
|
||||||
will be deleted and the new one will be added).
|
|
||||||
|
|
||||||
### Cons
|
|
||||||
|
|
||||||
All operations will be done on the same index, sequentially. Old records will
|
|
||||||
first be discarded, then new ones will be added. Users doing a search on your
|
|
||||||
website during the update will have inconsistent or incomplete results.
|
|
||||||
|
|
||||||
## `atomic`
|
|
||||||
|
|
||||||
The `atomic` mode solves the inconsistency issue of the `diff` mode. Instead of
|
|
||||||
doing all changes in sequence on the same index, the updates will be done on
|
|
||||||
a temporary index in the background.
|
|
||||||
|
|
||||||
The plugin will start by making a copy of the existing data, and will then apply
|
|
||||||
the `diff` method to it: it will remove old records and add new ones to this
|
|
||||||
index. While those changes are applied, your current index is still serving
|
|
||||||
search queries by your users. Once all changes are applied, the plugin will
|
|
||||||
replace the current public index with the temporary one, all in one atomic move.
|
|
||||||
|
|
||||||
### Cons
|
|
||||||
|
|
||||||
As this method will need to create a copy of your current index during indexing,
|
|
||||||
it means you will need an Algolia plan that can hold at least **twice** the
|
|
||||||
number of records.
|
|
||||||
@ -46,17 +46,16 @@ been changed:
|
|||||||
[extensions_to_index][3]. Note that for the last one, it now expects
|
[extensions_to_index][3]. Note that for the last one, it now expects
|
||||||
a comma-separated list of extensions.
|
a comma-separated list of extensions.
|
||||||
|
|
||||||
The `lazy_update` option has renamed to [indexing_mode][4]. The default indexing
|
The `lazy_update` option does not exist anymore. The new indexing mode is
|
||||||
mode ([diff][5]), is equivalent to `lazy_update: true`. This means that by
|
equivalent to `lazy_update: true`. Only records that changed between the current
|
||||||
default the plugin will now be smart enough to only update records that actually
|
build and the previous one will be updated, and it will even be done in an
|
||||||
changed since the last run. You can still get the old behavior of re-pushing
|
atomic way (all in one go).
|
||||||
everything every time by using the [atomic][6] indexing mode.
|
|
||||||
|
|
||||||
## Hooks
|
## Hooks
|
||||||
|
|
||||||
All three hooks (`custom_hook_excluded_file?`, `custom_hook_each` and
|
All three hooks (`custom_hook_excluded_file?`, `custom_hook_each` and
|
||||||
`custom_hook_all`) are still here, but they have been renamed to
|
`custom_hook_all`) are still here, but they have been renamed to
|
||||||
[should_be_excluded?][7], [before_indexing_each][8] and [before_indexing_all][9].
|
[should_be_excluded?][4], [before_indexing_each][5] and [before_indexing_all][6].
|
||||||
|
|
||||||
They all have the same behavior and expect the same arguments as before, but
|
They all have the same behavior and expect the same arguments as before, but
|
||||||
should now extend the `Jekyll::Algolia::Hooks` module. It means that the file
|
should now extend the `Jekyll::Algolia::Hooks` module. It means that the file
|
||||||
@ -72,7 +71,7 @@ module Jekyll
|
|||||||
end
|
end
|
||||||
```
|
```
|
||||||
|
|
||||||
You can find the complete documentation on the [dedicated page][10].
|
You can find the complete documentation on the [dedicated page][7].
|
||||||
|
|
||||||
## Records
|
## Records
|
||||||
|
|
||||||
@ -114,17 +113,14 @@ Here is an example of a record extracted by the plugin:
|
|||||||
## Need more help?
|
## Need more help?
|
||||||
|
|
||||||
If you need more help migrating from the previous plugin to this new version,
|
If you need more help migrating from the previous plugin to this new version,
|
||||||
you can [file an issue][11] on the GitHub repo and we'll do our best to help you.
|
you can [file an issue][8] on the GitHub repo and we'll do our best to help you.
|
||||||
|
|
||||||
|
|
||||||
[1]: ./options.html#files-to-exclude
|
[1]: ./options.html#files-to-exclude
|
||||||
[2]: ./options.html#nodes-to-index
|
[2]: ./options.html#nodes-to-index
|
||||||
[3]: ./options.html#extensions-to-index
|
[3]: ./options.html#extensions-to-index
|
||||||
[4]: ./options.html#indexing-mode
|
[4]: ./hooks.html#should-be-excluded
|
||||||
[5]: ./indexing-modes.html#diff-default
|
[5]: ./hooks.html#before-indexing-each
|
||||||
[6]: ./indexing-modes.html#atomic
|
[6]: ./hooks.html#before-indexing-all
|
||||||
[7]: ./hooks.html#should-be-excluded
|
[7]: ./hooks.html
|
||||||
[8]: ./hooks.html#before-indexing-each
|
[8]: https://github.com/algolia/jekyll-algolia/issues
|
||||||
[9]: ./hooks.html#before-indexing-all
|
|
||||||
[10]: ./hooks.html
|
|
||||||
[11]: https://github.com/algolia/jekyll-algolia/issues
|
|
||||||
|
|||||||
@ -62,27 +62,6 @@ algolia:
|
|||||||
_Note that some files (pagination pages, static assets, etc) will **always** be
|
_Note that some files (pagination pages, static assets, etc) will **always** be
|
||||||
excluded and you don't have to specify them._
|
excluded and you don't have to specify them._
|
||||||
|
|
||||||
## `indexing_batch_size`
|
|
||||||
|
|
||||||
The Algolia API allows you to send batches of changes to add or update several
|
|
||||||
records at once, instead of doing one HTTP call per record. The plugin will
|
|
||||||
batch updates by groups of 1000 records by default.
|
|
||||||
|
|
||||||
If you are on an unstable internet connection, you might want to decrease the
|
|
||||||
value. You will send more batches, but each will be smaller in size.
|
|
||||||
|
|
||||||
```yml
|
|
||||||
algolia:
|
|
||||||
# Send fewer records per batch
|
|
||||||
indexing_batch_size: 500
|
|
||||||
```
|
|
||||||
|
|
||||||
## `indexing_mode`
|
|
||||||
|
|
||||||
This option will let you choose the strategy used to sync your data with your
|
|
||||||
Algolia index. The default value should work for most cases, but feel free to
|
|
||||||
[read the pros and cons][4] of each and pick the one best suited for your needs.
|
|
||||||
|
|
||||||
## `nodes_to_index`
|
## `nodes_to_index`
|
||||||
|
|
||||||
This option defines how each page is split into chunks. It expects
|
This option defines how each page is split into chunks. It expects
|
||||||
@ -108,7 +87,7 @@ This option let you pass specific settings to your Algolia index.
|
|||||||
By default the plugin will configure your Algolia index with settings tailored
|
By default the plugin will configure your Algolia index with settings tailored
|
||||||
to the format of the extracted records. You are of course free to overwrite
|
to the format of the extracted records. You are of course free to overwrite
|
||||||
them or configure them as best suits your needs. Every option passed to the
|
them or configure them as best suits your needs. Every option passed to the
|
||||||
`settings` entry will be set as [settings to your index][5].
|
`settings` entry will be set as [settings to your index][4].
|
||||||
|
|
||||||
For example if you want to change the HTML tag used for the highlighting, you
|
For example if you want to change the HTML tag used for the highlighting, you
|
||||||
can overwrite it like this:
|
can overwrite it like this:
|
||||||
@ -120,9 +99,26 @@ algolia:
|
|||||||
highlightPostTag: '</em>'
|
highlightPostTag: '</em>'
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## `indexing_batch_size`
|
||||||
|
|
||||||
|
This option defines the number of operations that will be grouped as part of one
|
||||||
|
updating batch. All operations of one batch are applied atomically. The default
|
||||||
|
value is `1000`.
|
||||||
|
|
||||||
|
You might want to increase this value if you are doing a lot of updates on each
|
||||||
|
run and still want to have your changes done atomically.
|
||||||
|
|
||||||
|
You might want to decrease this value if you're using an unstable internet
|
||||||
|
connection. Smaller batches are easier to send than large ones.
|
||||||
|
|
||||||
|
```yml
|
||||||
|
algolia:
|
||||||
|
# Send fewer records per batch
|
||||||
|
indexing_batch_size: 500
|
||||||
|
```
|
||||||
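To make the effect of `indexing_batch_size` more concrete, here is a small Ruby sketch of how a list of operations could be split into batches. The `batches_for` helper is hypothetical and only illustrates the grouping. It does not show how the plugin actually sends batches to Algolia.

```ruby
# Hypothetical helper: split a list of operations into groups of at most
# `batch_size` items. Each group would be sent as one batch call.
def batches_for(operations, batch_size = 1000)
  operations.each_slice(batch_size).to_a
end
```

Under this sketch, 2500 operations with the default size give three batches (1000, 1000 and 500 operations), so atomicity holds within each batch but not across all three. Setting the size to 500 would give five smaller batches instead.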
|
|
||||||
|
|
||||||
[1]: ./how-it-works.html
|
[1]: ./how-it-works.html
|
||||||
[2]: http://www.methods.co.nz/asciidoc/
|
[2]: http://www.methods.co.nz/asciidoc/
|
||||||
[3]: https://github.com/textile
|
[3]: https://github.com/textile
|
||||||
[4]: ./indexing-modes.html
|
[4]: https://www.algolia.com/doc/api-reference/api-methods/set-settings/?language=ruby#set-settings
|
||||||
[5]: https://www.algolia.com/doc/api-reference/api-methods/set-settings/?language=ruby#set-settings
|
|
||||||
|
|||||||