sharpetronics/jekyll-algolia

Pixelastic 7f606d3a85 Update Rubocop

2017-11-17 16:45:50 +01:00

docs

Update README and remove old files

2017-11-17 13:47:10 +01:00

errors

Add helpful error messages for each known error

2017-11-17 12:08:19 +01:00

lib

Add helpful error messages for each known error

2017-11-17 12:08:19 +01:00

scripts

Make tests work on all ruby versions

2017-11-15 10:00:41 +01:00

spec

Update Rubocop

2017-11-17 16:45:50 +01:00

.coveralls.yml

test(coveralls): Add coverall files

2015-07-16 18:12:34 +02:00

.gitignore

fix(v3): Stop throwing deprecation warning when using Jekyll v3

2016-01-11 09:47:58 +01:00

.rspec

test(rspec): Start adding rspec tests

2015-07-02 11:05:23 +02:00

.rubocop.yml

Consider focused tests as linting errors

2017-11-14 17:08:32 +01:00

.travis.yml

Have tests working for all supported Ruby versions

2017-11-07 15:56:23 +01:00

CONTRIBUTING.md

Rename files from algoliasearch-jekyll to jekyll-algolia

2017-11-07 16:46:46 +01:00

Gemfile

Refactoring skeleton, by splitting into specific classes

2017-11-07 22:07:46 +01:00

Guardfile

Only reload changed tests, not the whole suite

2017-11-10 18:00:30 +01:00

jekyll-algolia.gemspec

Update Rubocop

2017-11-17 16:45:50 +01:00

LICENSE.txt

feat(jeweler): Add Jeweler

2015-07-16 12:02:59 +02:00

Rakefile

Make tests work on all ruby versions

2017-11-15 10:00:41 +01:00

README.md

Add note about it being still beta

2017-11-17 15:56:53 +01:00

README.md

Jekyll Algolia Plugin

Jekyll plugin to automatically index your content into Algolia.

⚠ Unreleased beta version

This plugin is not yet released on RubyGems. To try it, clone the repository and point your Gemfile to the path on disk like this:

group :jekyll_plugins do
  gem "jekyll-algolia", :path => "/path/to/the/cloned/repo"
end

Feedback very welcome!

Usage

$ jekyll algolia

This will push the content of your Jekyll website to your Algolia index.

Installation

The plugin requires at least Jekyll 3.6.2 and Ruby 2.2.8 (the versions deployed on GitHub Pages at the time of writing).

First, add the jekyll-algolia gem to your Gemfile, in the :jekyll_plugins section.

If you do not yet have a Gemfile, here is the minimal content to get you started. You will also need Bundler to be able to use the Gemfile.

source 'https://rubygems.org'

gem 'jekyll', '~> 3.6'

group :jekyll_plugins do
  gem 'jekyll-algolia'
end

Once this is done, download all dependencies with bundle install.

If everything went well, you should be able to run jekyll help and see the algolia subcommand listed.

Basic configuration

Add your Algolia credentials under the algolia section of your _config.yml file like this:

algolia:
  application_id: 'your_application_id'
  index_name:     'your_index_name'

If you don't yet have an Algolia account, you can open a free Community plan here. If you already have an account, you can get your credentials from your dashboard.

Your API key will be read from the ALGOLIA_API_KEY environment variable. You can define it on the same line as your command, allowing you to type ALGOLIA_API_KEY='your_api_key' jekyll algolia.

⚠ Other, insecure, method ⚠

You can also store your API key in a file named _algolia_api_key, in your source directory. If you do this, we very, very, very strongly encourage you to make sure the file is not tracked in your versioning system.
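The two ways of providing the API key described above can be sketched as a small lookup. This is only an illustration: the helper name `resolve_api_key` is hypothetical, and the precedence (environment variable first, file second) is an assumption, not something this README states.

```ruby
# Hypothetical helper, not the plugin's actual code.
# Assumption: the environment variable takes precedence over the file.
def resolve_api_key(env, source_dir)
  key = env['ALGOLIA_API_KEY']
  return key unless key.nil? || key.empty?

  # Fall back to the _algolia_api_key file in the source directory
  key_file = File.join(source_dir, '_algolia_api_key')
  return nil unless File.file?(key_file)

  content = File.read(key_file).strip
  content.empty? ? nil : content
end
```

In a real run, it would be called as `resolve_api_key(ENV, '/path/to/your/site')`.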

How it works

The plugin will work like a jekyll build run, but instead of writing .html files to disk, it will push content to Algolia.

It will split each page of your website into small chunks (by default, one per <p> paragraph) and then push each chunk as a new record to Algolia. Splitting records that way yields a better relevance of results even on long pages.

The placement of each paragraph in the page heading hierarchy (title, subtitles through <h1> to <h6>) is also taken into account to further improve relevance of results.

Each record will also contain metadata about the page it was extracted from (including slug, url, tags, categories, collection and any custom field added to the front-matter).
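To make the splitting more concrete, here is a minimal sketch of turning a page into one record per paragraph, each carrying its heading hierarchy and the page metadata. It is a simplification under stated assumptions: a regular expression stands in for a real HTML parser, and the method name `extract_records` and the record keys (`html_tag`, `content`, `hierarchy`) are hypothetical, not the plugin's actual record format.

```ruby
# Simplified sketch: a regex scan stands in for a real HTML parser.
# Method name and record keys are hypothetical.
def extract_records(html, metadata = {})
  hierarchy = {}
  records = []

  html.scan(%r{<(h[1-6]|p)\b[^>]*>(.*?)</\1>}m) do |tag, inner|
    # Drop nested tags to keep only the text
    text = inner.gsub(/<[^>]+>/, '').strip

    if tag == 'p'
      # One record per paragraph, with its position in the headings
      records << metadata.merge(html_tag: 'p', content: text, hierarchy: hierarchy.dup)
    else
      level = tag[1].to_i
      # A new heading resets its own level and every deeper one
      hierarchy = hierarchy.reject { |lvl, _| lvl >= level }
      hierarchy[level] = text
    end
  end

  records
end

page = '<h1>Guide</h1><p>Intro</p><h2>Install</h2><p>Run <code>bundle</code> now</p>'
records = extract_records(page, url: '/guide/')
```

Here the second record would carry both the `<h1>` and `<h2>` titles in its hierarchy, while the first one only carries the `<h1>` title.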

Every time you run jekyll algolia, a full build of the website is run locally, but only records that were changed since your last build will be updated in your index.

Advanced configuration

The plugin should work out of the box for most websites, but there are options you can tweak if needed. All the options should be added under the algolia section of your _config.yml file.

`nodes_to_index`

By default, each page of your website will be split into chunks based on this CSS selector. The default value of p means that one record will be created for each <p> in your generated content.

If you would like to index other elements, like <blockquote>, <li> or a custom <div class="paragraph">, you should edit the value like this:

algolia:
  # Also index quotes, list items and custom paragraphs
  nodes_to_index: 'p,blockquote,li,div.paragraph'

`extensions_to_index`

By default, HTML and Markdown files will be indexed. If you are using another markup language (such as AsciiDoc or Textile), you should overwrite this option.

algolia:
  # Also index AsciiDoc and Textile files
  extensions_to_index: 'html,md,adoc,textile'
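As an illustration of how such a comma-separated value could be applied, the sketch below checks a file path against the list. The method name `indexable_extension?` is hypothetical, and the default of `'html,md'` is an assumption derived from "HTML and Markdown"; the plugin's real default string may differ.

```ruby
# Hypothetical helper; the 'html,md' default is an assumption.
def indexable_extension?(path, extensions_to_index = 'html,md')
  allowed = extensions_to_index.split(',').map { |ext| ext.strip.downcase }
  # File.extname returns ".md"; drop the leading dot before comparing
  extension = File.extname(path).sub(/\A\./, '').downcase
  allowed.include?(extension)
end

indexable_extension?('docs/intro.adoc')                          # not in the assumed default list
indexable_extension?('docs/intro.adoc', 'html,md,adoc,textile')  # allowed once the option is extended
```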

`files_to_exclude`

The plugin will try to be smart about the pages it should not index. Some files will always be excluded from the indexing (static assets, custom 404 and pagination pages). Others are handled by the files_to_exclude option.

By default it will exclude all the index.html and index.md files. Those files usually contain either little text (landing pages) or redundant text (latest blog articles), so we decided to exclude them by default.

If you actually want to index those files, you should set the value to an empty array.

algolia:
  # Actually index the index.html/index.md pages
  files_to_exclude: []

If you want to exclude more files, you should add them to the array:

algolia:
  # Exclude more files from indexing
  files_to_exclude:
    - index.html
    - index.md
    - excluded-file.html
    - /_posts/2017-01-20-date-to-forget.md

`settings`

By default the plugin will configure your Algolia index with settings tailored to the format of the extracted records. You are of course free to overwrite them or configure them as best suits your needs. Every option passed to the settings entry will be passed to a call to set_settings.

For example if you want to change the HTML tag used for the highlighting, you can overwrite it like this:

algolia:
  settings:
    highlightPreTag: '<em class="custom_highlight">'
    highlightPostTag: '</em>'
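One way to picture the override is a plain hash merge where your values win over the defaults. This is only a sketch: `DEFAULT_SETTINGS` and `index_settings` are hypothetical names, and the default values shown are placeholders, not the plugin's real defaults.

```ruby
# Placeholder values, for illustration only (not the plugin's real defaults).
DEFAULT_SETTINGS = {
  'highlightPreTag' => '<em>',
  'highlightPostTag' => '</em>'
}.freeze

# Hypothetical helper: user settings from _config.yml override the defaults.
def index_settings(config)
  user_settings = (config['algolia'] || {})['settings'] || {}
  DEFAULT_SETTINGS.merge(user_settings)
end

config = { 'algolia' => { 'settings' => { 'highlightPreTag' => '<em class="custom_highlight">' } } }
index_settings(config)
```

With this config, only `highlightPreTag` changes; `highlightPostTag` keeps its placeholder default.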

`indexing_batch_size`

The Algolia API allows you to send batches of changes to add or update several records at once, instead of doing one HTTP call per record. By default, the plugin batches updates in groups of 1000 records.

If you are on an unstable internet connection, you might want to decrease the value. You will send more batches, but each will be smaller in size.

algolia:
  # Send fewer records per batch
  indexing_batch_size: 500
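The effect of the batch size can be sketched with Ruby's `each_slice`. The method name `batches` is hypothetical, and each integer below simply stands in for one record.

```ruby
# Hypothetical helper: split records into groups of at most batch_size.
def batches(records, batch_size = 1000)
  records.each_slice(batch_size).to_a
end

records = (1..2500).to_a
batches(records).map(&:size)      # => [1000, 1000, 500]
batches(records, 500).map(&:size) # => [500, 500, 500, 500, 500]
```

A smaller value means more HTTP calls, but each one carries less data.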

`indexing_mode`

Synchronizing your local data with your Algolia index can be done in different ways. By default, the plugin will use the diff indexing mode but you might also be interested in the atomic mode.

`diff` (default)

By default, the plugin will try to be smart when pushing content to your index: it will only push new records and delete old ones instead of overwriting everything.

To do so, we first grab the list of all records residing in your index, then compare them with the ones generated locally. We then delete the old records that no longer exist, and add the newly created records.

The main advantage is that it will consume very few operations in your Algolia quota. The drawback is that it will put your index into an inconsistent state for a few seconds (records were deleted, but new ones were not yet added). Users doing a search on your website at that time might get incomplete results.
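The comparison step can be sketched as two set differences over record identifiers. The assumptions here are loud ones: that each record is identified by a hash of its content (so an edited paragraph looks like one deletion plus one addition), and the names `with_object_id` and `diff_operations`, which are hypothetical. The README does not describe how the plugin actually identifies records.

```ruby
require 'digest'
require 'json'

# Assumption: a record is identified by a hash of its content.
def with_object_id(record)
  fingerprint = JSON.generate(record.sort_by { |key, _| key.to_s })
  record.merge(objectID: Digest::MD5.hexdigest(fingerprint))
end

# Hypothetical helper: compare remote ids with locally generated records.
def diff_operations(remote_ids, local_records)
  local_by_id = local_records.each_with_object({}) do |record, memo|
    memo[record[:objectID]] = record
  end

  {
    # In the index but no longer generated locally
    delete: remote_ids - local_by_id.keys,
    # Generated locally but not yet in the index
    add: (local_by_id.keys - remote_ids).map { |id| local_by_id[id] }
  }
end
```

Records present on both sides appear in neither list, which is why this mode uses few operations.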

`atomic`

Using the atomic indexing mode, your users will never search an inconsistent index. They will either be searching the index containing the old data, or the one containing the new data, but never an intermediate state.

To do so, the plugin will actually push all data to a temporary index first. Once everything is copied and configured, it will then overwrite the old index with the temporary one.

The main advantage is that it will be completely transparent for your users. The drawback is that it will consume many more operations, as you will have to push all your records to a new index each time.
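The push-then-swap sequence can be simulated with an in-memory stand-in for the Algolia service. Everything here is hypothetical: the `FakeAlgolia` class, the `atomic_reindex` method and the `_tmp` suffix are illustration names, not the plugin's or the Algolia client's API, and copying settings to the temporary index is left out for brevity.

```ruby
# In-memory stand-in for the Algolia service (hypothetical, for illustration).
class FakeAlgolia
  attr_reader :indexes

  def initialize
    @indexes = {}
  end

  def add_records(index_name, records)
    (@indexes[index_name] ||= []).concat(records)
  end

  # Replaces the destination with the source in a single step
  def move_index(source, destination)
    @indexes[destination] = @indexes.delete(source)
  end
end

# Hypothetical helper; the "_tmp" suffix is an assumption.
def atomic_reindex(client, index_name, records)
  tmp_name = "#{index_name}_tmp"
  client.indexes.delete(tmp_name)         # start from an empty temporary index
  client.add_records(tmp_name, records)   # searches still hit the old index
  client.move_index(tmp_name, index_name) # swap: searches now hit the new data
end
```

Until the final move, the live index is untouched, so a search sees either all of the old records or all of the new ones.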

Thanks

Thanks to Anatoliy Yastreb for a great tutorial on creating Jekyll plugins.
