Ashwin Maroli e94055d116 relax version constraints in README

update badges and documentation on supported Ruby and Jekyll versions

2017-11-28 14:19:08 +01:00

Jekyll Algolia Plugin

Jekyll plugin to automatically index your content on Algolia.

⚠ Unreleased beta version

This plugin has not yet been released on Rubygems. If you wish to try it, simply point your Gemfile to the develop branch of this repo:

group :jekyll_plugins do
  gem "jekyll-algolia", git: "https://github.com/algolia/jekyll-algolia", branch: "develop"
end

Alternatively, clone the repository first and then update your site's Gemfile to point to the path on disk like this:

group :jekyll_plugins do
  gem "jekyll-algolia", path: "/path/to/the/cloned/repo"
end

Feedback is very welcome!

Usage

$ bundle exec jekyll algolia

This will push the content of your Jekyll website to your Algolia index.

Installation

The plugin requires a minimum version of Jekyll 3.6.0 and a minimum Ruby version of 2.3.0.

First, add the jekyll-algolia gem to your Gemfile, in the :jekyll_plugins section.

If you do not have a Gemfile already, here is the minimal content to get you started. You will also need Bundler to be able to use the Gemfile.

source 'https://rubygems.org'

gem 'jekyll', '~> 3.6'

group :jekyll_plugins do
  gem 'jekyll-algolia'
end

Once this is done, download all dependencies with bundle install.

If everything went well, you should be able to run jekyll help and see the algolia subcommand listed.

Basic configuration

You need to provide certain Algolia credentials for this plugin to successfully index your site.

If you don't yet have an Algolia account, you can open a free Community plan here. Once signed in, you can get your credentials from your dashboard.

The plugin first looks for the credentials in your environment variables and falls back to your Jekyll configuration if they are not found.

To pass the credentials as environment variables, set them on the same command line as the jekyll algolia command:

# for example

ALGOLIA_APPLICATION_ID='your_application_id' jekyll algolia

The valid ENV variables are:

| key | value |
| --- | --- |
| ALGOLIA_APPLICATION_ID | `your_application_id` |
| ALGOLIA_API_KEY | `your_api_key` |
| ALGOLIA_INDEX_NAME | `your_index_name` |

As a fallback, the plugin also checks whether application_id and index_name are available under the algolia key in your _config.yml file, like this:

# _config.yml

algolia:
  application_id: 'your_application_id'
  index_name:     'your_index_name'
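
The lookup order described above can be sketched in a few lines of Ruby. This is only an illustration, not the plugin's actual code: `resolve_credential` is a hypothetical helper, and the configuration is assumed to be the already-parsed content of _config.yml.

```ruby
# Hypothetical helper (not part of the plugin's API): resolve one credential
# by checking the environment first, then the `algolia` section of the config.
def resolve_credential(name, env: ENV, config: {})
  env_value = env["ALGOLIA_#{name.upcase}"]
  return env_value unless env_value.nil? || env_value.empty?

  # Fall back to the `algolia` key of the parsed _config.yml
  (config['algolia'] || {})[name]
end

config = { 'algolia' => { 'application_id' => 'from_config', 'index_name' => 'my_index' } }
env    = { 'ALGOLIA_APPLICATION_ID' => 'from_env' }

puts resolve_credential('application_id', env: env, config: config) # found in the environment
puts resolve_credential('index_name', env: env, config: config)     # found in the config
```

A plain Hash stands in for `ENV` in the example so that the sketch can be run without touching your real environment.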

⚠ Other, insecure, method ⚠

You can also store your confidential API key in a file named _algolia_api_key, in your source directory. If you do this we very, very, very strongly encourage you to make sure the file is not tracked in your versioning system.

How it works

The plugin will work like a jekyll build run, but instead of writing .html files to disk, it will push content to Algolia.

It will split each page of your website into small chunks (by default, one per <p> paragraph) and then push each chunk as a new record to Algolia. Splitting records this way yields more relevant results, even on long pages.

The placement of each paragraph in the page heading hierarchy (title, subtitles through <h1> to <h6>) is also taken into account to further improve relevance of results.

Each record will also contain metadata about the page it was extracted from (including slug, url, tags, categories, collection and any custom field added to the front-matter).
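
To make the splitting more concrete, here is a deliberately simplified sketch. It is not how the plugin is implemented: a regular expression stands in for a real HTML parser, `extract_records` is a hypothetical name, and the record keys (`text`, `hierarchy`) are assumptions chosen for the example.

```ruby
# Simplified sketch: create one record per <p>, tagged with the headings
# (<h1> to <h6>) it sits under, plus the metadata of its page.
def extract_records(html, page_metadata = {})
  hierarchy = {} # e.g. { 'h1' => 'Title', 'h2' => 'Section' }
  records = []

  html.scan(%r{<(h[1-6]|p)\b[^>]*>(.*?)</\1>}m) do |tag, content|
    text = content.gsub(/<[^>]+>/, '').strip # drop inline tags such as <em>
    if tag == 'p'
      records << page_metadata.merge(text: text, hierarchy: hierarchy.dup)
    else
      level = tag[1].to_i
      # A new heading replaces any heading of the same or a deeper level
      hierarchy.delete_if { |name, _| name[1].to_i >= level }
      hierarchy[tag] = text
    end
  end

  records
end

html = '<h1>Guide</h1><p>Intro</p><h2>Setup</h2><p>Install the <em>gem</em>.</p>'
extract_records(html, { url: '/guide/' }).each { |record| p record }
```

The second paragraph ends up with both the `h1` and `h2` headings in its hierarchy, which is the kind of context the relevance ranking can take advantage of.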

Every time you run jekyll algolia, a full build of the website is run locally, but only records that were changed since your last build will be updated in your index.

Advanced configuration

The plugin should work out of the box for most websites, but there are options you can tweak if needed. All the options should be added under the algolia section of your _config.yml file.

`nodes_to_index`

By default, each page of your website will be split into chunks based on this CSS selector. The default value of p means that one record will be created for each <p> in your generated content.

If you would like to index other elements, such as <blockquote>, <li> or a custom <div class="paragraph">, you should edit the value like this:

algolia:
  # Also index quotes, list items and custom paragraphs
  nodes_to_index: 'p,blockquote,li,div.paragraph'

`extensions_to_index`

By default, pages whose source is an HTML or Markdown file will be indexed. If you are using another markup language (such as AsciiDoc or Textile), then you should overwrite this option.

For example, the md extension here means that *.md source files will be converted to HTML, and that HTML version is what will be indexed.

algolia:
  # Also index AsciiDoc and Textile files
  extensions_to_index: 'html,md,adoc,textile'
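
As a rough illustration of how such a comma-separated list could be applied to a source file, consider the sketch below. `indexable_extension?` is a hypothetical helper, and the `'html,md'` default is an assumption based on the description above rather than a value taken from the plugin.

```ruby
# Hypothetical check: does this source file have one of the listed extensions?
def indexable_extension?(path, extensions_to_index = 'html,md')
  allowed = extensions_to_index.split(',').map(&:strip)
  allowed.include?(File.extname(path).sub(/\A\./, ''))
end

puts indexable_extension?('_posts/2017-01-20-hello.md')         # true
puts indexable_extension?('docs/guide.adoc')                    # false
puts indexable_extension?('docs/guide.adoc', 'html,md,adoc')    # true
```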

`files_to_exclude`

The plugin will try to be smart about which pages it should not index. Some files will always be excluded from the indexing (static assets, custom 404 and pagination pages). Others are handled by the files_to_exclude option.

By default it will exclude all the index.html and index.md files. Those files usually either do not contain much text (landing pages) or contain redundant text (latest blog articles), so we decided to exclude them by default.

If you actually want to index those files, you should set the value to an empty array.

algolia:
  # Actually index the index.html/index.md pages
  files_to_exclude: []

If you want to exclude more files, you should add them to the array. Note that you can use glob patterns to exclude several files at once.

algolia:
  # Exclude more files from indexing
  files_to_exclude:
    - index.html
    - index.md
    - excluded-file.html
    - _posts/2017-01-20-date-to-forget.md
    - subdirectory/*.html
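
If you are unsure what a glob pattern will match, you can try it out with Ruby's standard `File.fnmatch`. The sketch below is only an approximation: `excluded?` is a hypothetical helper, and it assumes patterns are compared against paths relative to your source directory, which may differ from how the plugin resolves them.

```ruby
# Hypothetical helper: true if the path matches at least one exclusion pattern.
# File::FNM_PATHNAME prevents `*` from matching across directory separators.
def excluded?(path, files_to_exclude)
  files_to_exclude.any? do |pattern|
    File.fnmatch(pattern, path, File::FNM_PATHNAME)
  end
end

patterns = ['index.html', 'index.md', 'subdirectory/*.html']

puts excluded?('subdirectory/page.html', patterns) # true
puts excluded?('about.md', patterns)               # false
```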

`settings`

By default the plugin will configure your Algolia index with settings tailored to the format of the extracted records. You are of course free to overwrite them or configure them as best suits your needs. Every option passed to the settings entry will be passed to a call to set_settings.

For example if you want to change the HTML tag used for the highlighting, you can overwrite it like this:

algolia:
  settings:
    highlightPreTag: '<em class="custom_highlight">'
    highlightPostTag: '</em>'

`indexing_batch_size`

The Algolia API allows you to send batches of changes to add or update several records at once, instead of doing one HTTP call per record. By default, the plugin will batch updates in groups of 1000 records.

If you are on an unstable internet connection, you might want to decrease the value. You will send more batches, but each will be smaller in size.

algolia:
  # Send fewer records per batch
  indexing_batch_size: 500
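
The effect of the batch size on the number of calls can be seen with a short sketch. `in_batches` is a hypothetical helper written for this example, and the records are placeholders.

```ruby
# Hypothetical helper: group records into batches of at most `batch_size`.
def in_batches(records, batch_size = 1000)
  records.each_slice(batch_size).to_a
end

records = (1..2500).map { |i| { objectID: i } }

p in_batches(records).map(&:size)      # [1000, 1000, 500]
p in_batches(records, 500).map(&:size) # [500, 500, 500, 500, 500]
```

With 2500 records, halving the batch size moves you from three calls to five, each carrying a smaller payload.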

`indexing_mode`

Synchronizing your local data with your Algolia index can be done in different ways. By default, the plugin will use the diff indexing mode but you might also be interested in the atomic mode.

`diff` (default)

By default, the plugin will try to be smart when pushing content to your index: it will only push new records and delete old ones instead of overwriting everything.

To do so, we first grab the list of all records residing in your index, then compare them with the ones generated locally. We then delete the old records that no longer exist, and add the newly created records.

The main advantage is that it will consume very few operations in your Algolia quota. The drawback is that it will put your index into an inconsistent state for a few seconds (old records were deleted, but new ones were not yet added). Users doing a search on your website at that time might get incomplete results.
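
The comparison step of the diff mode can be sketched as a simple set difference. This is an illustration only: `diff_records` is a hypothetical helper, and it assumes every record carries an `objectID` that changes whenever its content changes, which the description above does not spell out.

```ruby
# Hypothetical helper: work out which records to delete from the index and
# which local records to add, based on their objectID.
def diff_records(remote_ids, local_records)
  local_ids = local_records.map { |record| record[:objectID] }
  {
    to_delete: remote_ids - local_ids,
    to_add: local_records.reject { |record| remote_ids.include?(record[:objectID]) }
  }
end

remote_ids = %w[a b c]
local_records = [{ objectID: 'b' }, { objectID: 'c' }, { objectID: 'd' }]

p diff_records(remote_ids, local_records)
```

Here only record `a` is deleted and only record `d` is pushed; `b` and `c` are left untouched, which is why this mode uses so few operations.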

`atomic`

Using the atomic indexing mode, your users will never search an inconsistent index. They will either be searching the index containing the old data, or the one containing the new data, but never an intermediate state.

To do so, the plugin will actually push all data to a temporary index first. Once everything is copied and configured, it will then overwrite the old index with the temporary one.

The main advantage is that it will be completely transparent for your users. The drawback is that it will consume many more operations, as you will have to push all your records to a new index each time.
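
The idea behind the atomic mode can be sketched with an in-memory stand-in for Algolia. Nothing here calls the real API: `FakeIndices`, `atomic_update` and the `_tmp` suffix are all hypothetical names used for illustration.

```ruby
# In-memory stand-in for a set of Algolia indices (not the real API client).
class FakeIndices
  def initialize
    @indices = {}
  end

  def records(name)
    @indices.fetch(name, [])
  end

  def push(name, records)
    (@indices[name] ||= []).concat(records)
  end

  # Replace the `to` index with the `from` index in a single step
  def move(from, to)
    @indices[to] = @indices.delete(from)
  end
end

# Hypothetical sketch of the atomic mode: fill a temporary index, then swap.
def atomic_update(indices, index_name, records)
  tmp_name = "#{index_name}_tmp"
  indices.push(tmp_name, records)    # searches still hit the old index here
  indices.move(tmp_name, index_name) # searches now hit the new data
end

indices = FakeIndices.new
indices.push('my_site', [{ objectID: 'old' }])
atomic_update(indices, 'my_site', [{ objectID: 'new_1' }, { objectID: 'new_2' }])

p indices.records('my_site')
```

At no point does `my_site` hold a mix of old and new records, but every record had to be pushed again, which is where the extra operations come from.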

Thanks

Thanks to Anatoliy Yastreb for a great tutorial on creating Jekyll plugins.

