jTag is a command line application for manipulating Jekyll tags in the YAML headers. It can perform mass operations such as tag addition, removal, merging and sorting. It can also suggest relevant tags based on the content of the post matched against your existing tag set (requires a plugin/template in your Jekyll install).
Configuration includes a persistent blacklist and synonym definitions. Tags manually added to posts are automatically whitelisted for the auto-tagger. The auto-tagger can update your posts directly, or be used to provide suggestions for a working post that you can manually insert in the post.
NOTE: jTag works by modifying the YAML headers of your posts directly. There’s no undo. Back up your posts (which is a good idea anyway). The best way to do this is with git.[1]
The jTag tool is at 0.1.6 at the time of this writing. It’s been tested, but never used by anyone but me until today. I’ll be adding features and updating as I go, so please let me know about bugs.
First, though, let’s cover the plugin side, which hasn’t needed to change since I first wrote it. Set it up and forget it.
Generating a tag index
This plugin is a Page generator which creates /data/tags.json, a list of all the tags on your blog plus a list of the top 50 tags with a count of how many times each appears across your blog. You can download the plugin here.
The plugin looks for the tags_json.html template in your _layouts folder. See the example provided. The plugin provides the following, which can be used in the Liquid template:
In your blog’s _config.yml, you can optionally add rules for excluding tags or using an alternate destination:
- tag_excludes: array of tags to exclude from all listings
- tags_json_dir: alternate folder for the tags.json file. Set to blank for root, defaults to ‘data’
AddType application/json json
ExpiresByType application/json "access plus 0 seconds"
Place the template in
Now you can do a full build of your blog and the data/tags.json file will be ready for deploy.
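To make the contents of the tag index concrete, here is a rough Ruby sketch of the kind of data described above: every tag on the blog, plus the most-used tags with counts. It is not the plugin's code. The `posts` array, the `build_tag_index` helper, and the `tags` / `tags_count` key names are all assumptions made for illustration; the real structure is whatever your tags_json.html template outputs.

```ruby
require "json"

# Hypothetical input: each post reduced to a title and its list of tags.
posts = [
  { "title" => "Post one",   "tags" => ["markdown", "jekyll"] },
  { "title" => "Post two",   "tags" => ["markdown", "ruby"] },
  { "title" => "Post three", "tags" => ["markdown", "jekyll", "draft"] }
]

# Stand-in for the tag_excludes setting in _config.yml.
tag_excludes = ["draft"]

# Assumed helper: counts how many posts use each tag, skipping excluded tags.
def build_tag_index(posts, excludes: [], top: 50)
  counts = Hash.new(0)
  posts.each do |post|
    (post["tags"] || []).uniq.each do |tag|
      counts[tag] += 1 unless excludes.include?(tag)
    end
  end

  # Most-used tags first (ties broken alphabetically), limited to `top` entries.
  top_tags = counts.sort_by { |tag, count| [-count, tag] }
                   .first(top)
                   .map { |tag, count| { "name" => tag, "count" => count } }

  # Key names here are illustrative, not the plugin's actual output format.
  { "tags" => counts.keys.sort, "tags_count" => top_tags }
end

index = build_tag_index(posts, excludes: tag_excludes)
puts JSON.pretty_generate(index)
```

The excluded "draft" tag never reaches either list, which is the effect tag_excludes is described as having.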
jTag is currently at version 0.1.6. It will be a work in progress for a while as I make it more efficient and figure out the best ways to integrate it into my blogging workflow. It grabs all of the tags on your blog (the tags.json file) and compares them to the text you give it. It handles matching plurals and different conjugations of words so that you get tag results even if they’re not an exact match.
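As a rough illustration of how stem-based matching can find tags that aren't an exact match, here is a minimal Ruby sketch. The `naive_stem` suffix stripper is a deliberately crude stand-in and is not the stemmer jTag uses; `suggest_tags` and its `blacklist:` parameter are likewise hypothetical names for this example only.

```ruby
require "set"

# Crude suffix stripper for illustration only; NOT the stemmer jTag uses.
def naive_stem(word)
  w = word.downcase
  %w[ing ed es s].each do |suffix|
    # Strip at most one suffix, and never shrink a word below three letters.
    return w.delete_suffix(suffix) if w.end_with?(suffix) && w.length - suffix.length >= 3
  end
  w
end

# Returns the known tags whose stem appears among the stems of the words in the text.
def suggest_tags(text, known_tags, blacklist: [])
  stems_in_text = text.scan(/[[:alpha:]]+/).map { |word| naive_stem(word) }.to_set
  known_tags.select do |tag|
    !blacklist.include?(tag) && stems_in_text.include?(naive_stem(tag))
  end
end

known_tags = %w[script plugin podcast]
text = "I spent the weekend scripting a couple of new plugins."
p suggest_tags(text, known_tags) # => ["script", "plugin"]
```

Neither "script" nor "plugin" appears verbatim in the sample text, but both match once the words are reduced to their stems. A real stemmer handles far more cases than this one.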
The gem is hosted on rubygems.org. To install, either add the jtag dependency to your Gemfile in your Jekyll folder (gem "jtag", "~> 0.1.6") and run bundle, or run gem install jtag. If you’re not using a Ruby version manager, you may need to use sudo gem install jtag to install it in your system Ruby configuration.
jtag config to ensure it’s loaded and create the configuration folder and files in
The only configuration option required is the location (URL without protocol) of your tags.json file. It’s set in
There are three other files that are generated in ~/.jtag/, and they can be edited as needed:
This is a blacklist. Tags in this list will never show up in automatic results. If a blacklisted tag already exists in the post, it’s merged into the results, but the auto-tagger will never add it on its own. Some tags are generic enough in terminology that they’d be triggered on almost every post, and they’re usually tags that I’d enter myself anyway. For example, I have a tag for Gabe Weatherhead’s show, Generational: “generational.” Because jTag stems the word before scanning for it, anything based on the root “generi” will trigger that tag. Thus, it’s blacklisted and added manually on the infrequent occasions that I’m on the show.
The easiest way to add a tag to the blacklist is to run jtag blacklist TAGNAME. You can blacklist multiple tags at once; just separate them with spaces. The blacklist is stored in ~/.jtag/blacklist.txt and can be edited manually if needed.
This is a predefined list of common words that will most likely never be tags. They’re “stemmed” down to their root, so that “thing,” “things,” and “thingy” are all blocked by one line. If you find your results are missing a particular tag, scan the ~/.jtag/stopwords.txt file for the culprit.
The synonyms.yml file is a YAML file that defines alternate spellings, punctuation or related topics that determine when a tag is used. For example, my tag for Mountain Lion is “mountainlion” because I like single-word, lowercase tags. That isn’t going to be found in the text of my posts, though, so I add synonyms like this:
mountainlion:
  - Mountain Lion
  - 10.8
  - OS X 10.8
You can add as many as you like. When one of the listed terms is found in the text, the parent tag is included in the results.
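The synonym lookup described above can be sketched in a few lines of Ruby. This is only an illustration of the idea, using plain case-insensitive substring matching; `tags_from_synonyms` is a hypothetical helper, and jTag's real matching may well differ.

```ruby
require "yaml"

# Same structure as the synonyms.yml example: parent tag => list of trigger terms.
synonyms = YAML.safe_load(<<~YAML)
  mountainlion:
    - Mountain Lion
    - 10.8
    - OS X 10.8
YAML

# Hypothetical helper: returns the parent tags whose terms occur in the text.
def tags_from_synonyms(text, synonyms)
  haystack = text.downcase
  synonyms.select do |_tag, terms|
    # to_s because YAML reads an unquoted 10.8 as a number, not a string.
    terms.any? { |term| haystack.include?(term.to_s.downcase) }
  end.keys
end

p tags_from_synonyms("Notes on upgrading to Mountain Lion", synonyms) # => ["mountainlion"]
p tags_from_synonyms("Nothing relevant here", synonyms)               # => []
```

Note that an unquoted 10.8 in the YAML is parsed as a float, which is why the sketch converts each term to a string before comparing.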
Note: jTag requires that YAML headers exist in the post being processed. I may add the option to pass just content text and get back an appropriate auto-tag response, but for now you at least need a YAML --- header and terminator.
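For reference, this is roughly what "a YAML header and terminator" means in practice. The Ruby sketch below splits a post into its header and body and returns nil when no header is present. `split_front_matter` is a hypothetical helper written for this example, not part of jTag.

```ruby
require "yaml"
require "date"

# Split a post into its YAML header (as a Hash) and the remaining body text.
# Returns nil when the post lacks a leading --- header with a closing ---.
def split_front_matter(post)
  match = post.match(/\A---[ \t]*\n(.*?)\n---[ \t]*(?:\n|\z)(.*)\z/m)
  return nil unless match

  # Date and Time are permitted because Jekyll headers often carry a date field.
  header = YAML.safe_load(match[1], permitted_classes: [Date, Time]) || {}
  [header, match[2]]
end

post = <<~POST
  ---
  title: Tagging with jTag
  tags:
  - jekyll
  - blogging
  ---
  The body of the post goes here.
POST

header, body = split_front_matter(post)
p header["tags"]                                  # => ["jekyll", "blogging"]
p split_front_matter("Just text, no header.")     # => nil
```

A file that produces nil here is the kind of input that has no header for a tool to read tags from or write them back to.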
jTag is built using subcommands, each with its own options and parameters. You can get help for jTag with jtag help, and documentation for each command with jtag help command.
jtag [-t] command [options] [arguments] filename(s)
    help - Shows a list of commands or help for one command
    add - Add tags to post(s)
    remove - Remove tags from post(s)
    merge - Merge multiple tags into one
    config - Update and notify user of configuration files location
    blacklist - Blacklist a specific tag
    search - List all tags, optionally filter for keywords/regular expressions (boolean OR search)
    sort - Sort the existing tags for posts
    tag - Generate a list of recommended tags, optionally updating the file
    tags - Show the current tags for posts
Use the global -t option with jTag to run it in “test” mode. No files will be overwritten if the command starts with jtag -t (command). All results will be written to STDOUT for inspection. If you run it with -t and pass -f fmt to relevant commands with a single file argument, it will output just the tag results in your selected format. This makes it possible to script jTag into other applications.
- Use config to install configuration files, replace missing ones, or reset to defaults (--reset).
- Use blacklist to quickly add or remove (run with -r) a tag from the blacklist.
- Use add on a single file, a glob pattern or the results of a grep, ack or find command (using xargs) to add one or more tags to the specified posts. With this, you can quickly add tags to an individual post, or add a tag to any posts that contain certain words in their content or title.
find source/_posts -name "*marky*" | xargs jtag add markdownifier
- Use remove to remove tags on select files, or run it on an entire folder to purge tags from your system.
- You can use merge to prune your taxonomy. Have two tags that only differ by capitalization or pluralization? Run something like jtag merge Markdown markdown _posts/*.md and all instances of “Markdown” will be converted to “markdown,” avoiding duplicates should they arise. You can merge as many tags as you want; the last one in the list will be the one that replaces them all.
- Use search to list all of the tags on your blog, optionally filtering the list with a keyword as the argument. Multiple keywords are combined in a boolean OR fashion, so any tag that matches any of the keywords will be returned. Keywords can also be quoted regular expressions (e.g. "^mark" to find only tags that start with “mark”).
Add the -c option to get tags back with a number representing how many posts they appear in. This command is a tool to help you find the proper punctuation, capitalization and pluralization of existing tags to ensure consistency.
$ jtag search -f list markdown
markdown
multimarkdown
markdownservices
markdownifier
markdownediting
markdownrules
- Use tag to suggest tags for your post based on its content. Suggested tags will be mixed with any existing tags (which are automatically whitelisted) and it will return a YAML array block to add to your post. By default, it will write the tags to the file automatically. If you use the -t global switch (jtag -t tag post.md) it will just give you the output (in YAML by default) and let you copy/paste it.
- Use tags to see the current tags on a post or posts. They’ll be listed with the filename as a header and the tags below. This is just handy for quickly confirming operations.
- The sort command is a convenience method to alphabetize the tags in your post. It can safely be run on your entire _posts folder just to tidy things up.
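To illustrate the merge and sort behavior described in the list above, here is a small Ruby sketch operating on a plain array of tags. It only models the described outcome (the last tag named replaces the others, duplicates collapse, tags end up alphabetized). `merge_tags` and `sort_tags` are hypothetical helpers, and the case-insensitive ordering is an assumption; jTag's own sort order may differ.

```ruby
# Replace every tag named before the last argument with the last one,
# then drop any duplicates the replacement creates.
def merge_tags(post_tags, *names)
  *sources, target = names
  post_tags.map { |tag| sources.include?(tag) ? target : tag }.uniq
end

# Alphabetize tags; ignoring case is an assumption made for this sketch.
def sort_tags(post_tags)
  post_tags.sort_by(&:downcase)
end

tags = ["Markdown", "writing", "markdown", "Jekyll"]
merged = merge_tags(tags, "Markdown", "markdown")
p merged            # => ["markdown", "writing", "Jekyll"]
p sort_tags(merged) # => ["Jekyll", "markdown", "writing"]
```

The post started with both “Markdown” and “markdown” and ends up with a single “markdown,” which is the duplicate-avoiding behavior the merge command is described as providing.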
Any command that outputs a list of tags has an option (-f, --format) to change the output format, for example list (plain text) or yaml (default). Hopefully this will provide some scripting and integration options in the future.
If you’re interested, there’s also a quick script you can drop into .bash_profile and get jtag tab-completion in Bash.
[1] If you’re not already using git to manage your blog (i.e. not deploying via git), just add the non-public parts of the directory (the source files) to a repo and make a commit when you finish a post. A local repo would be fine in most cases, but a remote repo offers a little extra peace of mind. Mine is set up so that a gen_deploy Rake task automatically creates a snapshot every time I publish, in addition to my own commits.